
JUNE 1989 


SPECIAL FAR EAST ISSUE 


IEEE COMPUTER SOCIETY 


THE INSTITUTE OF ELECTRICAL AND 
ELECTRONICS ENGINEERS, INC. 


PLUS 


The BTRON / 286 
applies the TRON Project 

HUMAN-MACHINE 
INTERFACE 

An FPU for TRON 

A Data-driven Processor for Consumer Electronics 
A Logical Tool for Relational Databases 
The Transputer T414 Instruction Set 


SPECIAL FEATURES 




G micro F32 Family, New 32-bit from Fujitsu 


F32/200, Super-Microprocessor of 7 EDN MIPS on TRON Architecture 


Features: 

•Optimized for execution of OS based on TRON specification 
•Software compatibility with F32/300 and F32/100 
• Instruction set and addressing modes optimized for high speed 
execution of high level languages 
•1.0 µm CMOS process 


High Performance Peripherals Support MPU’s High Speed Processing 



FPU: A coprocessor of F32 family MPUs for 
the execution of floating point operations at 
high speed. Operations and the data format 
of the FPU conform to the IEEE-754 standard. 

IRC: An interrupt request controller connectable to the F32 MPU bus and the VME bus. 

The IRC integrates two functions: Interrupt Request Generator (IRG) and Interrupt Request Handler (IRH). The IRG processes external interrupts to transmit to the IRH. The IRH processes interrupt requests from the IRG for transmission to the MPU. 



DMAC: A direct memory access controller that can access up to 4G bytes of address space. The DMAC provides an expansion bus, in addition to a local bus, and can support communication between these two buses. The transfer unit is selectable: byte, half word, word or long word. Data of different sizes can be transferred from any address boundary. 

TAGM: A tag memory that constructs an external cache using high-speed SRAM. TAGM consists of a tag memory cell array, replacement logic, and a comparator, and is set associative. 




Perfect Development Environment for Both Programming and Debugging 


Emulator: The emulator for the F32/200 consists of the Host Emulator software (EML32) that resides in the host computer and the In-Circuit Emulator hardware (MB2151). Together they perform software-hardware integrated system debugging. 


Single Board Computer (SBC): High-performance SBC includes both an F32/200 MPU and an FPU. With a monitor program on the board, the SBC supports basic debugging functions. 

Cross software: C Compiler, Assembler, Linkage Editor, Librarian, Load Module Converter, Simulator Debugger 


FUJITSU 


For further information please contact: 

FUJITSU LIMITED (Semiconductors Marketing): 

Furukawa Sogo Bldg., 6-1, Marunouchi 2-chome, Chiyoda-ku, Tokyo 100, Japan Phone: National (03) 216-3211 International (Int'l Prefix) 81-3-216-3211 Telex: 2224361FT TOR J 
FUJITSU MICROELECTRONICS, INC.: 3545 North First Street, San Jose, CA 95134-1804, U.S.A. Phone: 406-922-9000 TWX: 910-671-4915 
FUJITSU MIKROELEKTRONIK GmbH: Arabella Center 9, OG./A, Lyoner Straße 44-48 D-6000 Frankfurt 71, F.R. Germany Phone: 69-66-320 Telex: 411963 FMG D 
FUJITSU MICROELECTRONICS PACIFIC ASIA LIMITED: 805 Tsim Sha Tsui Centre, West Wing 66 Mody Road, Kowloon, Hong Kong Phone: 3-732 0100 Telex: 31959 FUJIS HX 


Reader Service Number 1 









Your Stairway to 
Software Heaven. 





The pressure on software developers to produce has never been greater. Yet there they sit, often reinventing the wheel, with more computational power than ever at their fingertips and the clock ticking away. Product delivery deadlines? So much pie in the sky. 

The problem? How best to put 
all that PC CPU capacity to good use. 

The solution: the Visible Analyst 
Workbench. 

The Visible Workbench makes the full power of CASE accessible to everyone. Running as a multi-user tool on Novell LANs or on individual PC workstations, the Visible Analyst Workbench lets teams of software engineers work together, and simultaneously, on large scale development projects. It makes after-the-fact piecing together of specifications a thing of the past. And, as our customers delight in telling us, it’s so easy to learn and use that people begin working more productively on “day one.” 

The Visible Analyst Workbench delivers real development power. The power of automated, linked structured analysis and design. The power of an automated data repository. The power of prototyping. The power of instantaneous communication and shared data between project members. The power of accurate, validated high level specifications with full documentation. And, soon, bridges to code generation, completing the promised CASE link “from pictures to code”. 

In fact, the Visible Analyst Workbench is the only PC-based CASE tool that combines ease of use, self-implementation, cost effectiveness and true multi-user capabilities. And, because it is a CASE tool, its usefulness will seem everlasting. 

Best of all, our step-by-step product growth path lets you begin building internal CASE resources at down-to-earth prices starting under $300. 

The Visible Analyst Workbench. Start building your stairway to software heaven today. 


Down to earth prices 


Professional Series — For large, multi¬ 
project systems development: 


3-Node LAN Pak . . . . . . . $3,500 
per additional node . . . . . 700 
Stand-alone version . . . . . 1,785 
with Prototyper . . . . . . . 2,380 


Personal and Educational Series — 
For small project development and 
educational needs. 


Personal Edition 


Educational and Training 
Version . 295 



The Bay Colony Corporate Center • 950 Winter St. • Waltham, MA 02154 USA 
(617) 890-CASE FAX (617) 890-8909 Telex 261102 VSCUR 

©1989 Visible Systems Corporation Visible, Visible Analyst Workbench, Visible Solution, The Visible Analyst and Visible Systems Corporation are registered Trademarks of Visible Systems Corporation 


Reader Service Number 2 








































IEEE Micro 


Editor-in-Chief 

Joe Hootman 

University of North Dakota* 

Editorial Board 

Shmuel Ben-Yaakov 

Ben Gurion University of the Negev 

Dante Del Corso 
Politecnico di Torino, Italy 

John Crawford 
Intel Corporation 

Stephen A. Dyer 
Kansas State University 

K.-E. Grosspietsch 
GMD, Germany 

David B. Gustavson 

Stanford Linear Accelerator Center 

Victor K.L. Huang 
AT&T Information Systems 

Barry W. Johnson 
University of Virginia 

David K. Kahaner 
National Bureau of Standards 

Jay Kamdar 

National Semiconductor Corporation 

Hubert D. Kirrmann 

Asea Brown Boveri Research Center 

Kenneth Majithia 
IBM Corporation 

Richard Mateosian 

Marlin H. Mickle 
University of Pittsburgh 

Varish Panigrahi 

Digital Equipment Corporation 

Ken Sakamura 
University of Tokyo 

Richard H. Stern 

Yoichi Yano 
NEC Corporation 


* Submit six copies of all articles and 
special-issue proposals to Joe 
Hootman, EE Dept., University of 
North Dakota, PO Box 7165, Grand 
Forks, ND 58202; (701)777-4331 
Compmail+ j. hootman 


Magazine Advisory Committee 

Sushil Jajodia (chair) 

Jon T. Butler 
Sumit DasGupta 
Joe Hootman 
Ted Lewis 
David Pessel 
H.T. Seaborn 
Bruce D. Shriver 
John Staudhammer 


Publications Board 

Sallie Sheppard (chair) 

James J. Farrell III (vice chair) 

James H. Aylor 

Victor Basili 

P. Bruce Berra 

J. Richard Burke 

J. T. Cain 

David Choy 

James Cross 

Sumit DasGupta 

Joe Hootman 

Anil K. Jain 

Glen G. Langdon 

Ted Lewis 

Ming T. Liu 

David Pessel 

C.V. Ramamoorthy 

Bruce D. Shriver 

John Staudhammer 

Harold Stone 

Steven L. Tanimoto 


Staff 

Editor and Publisher 
H.T. Seaborn 

Assistant Publisher 
Douglas Combs 

Managing Editor 
Marie English 

Assistant Editor 
Christine Miller 

Assistant to the Publisher 
Pat Paulsen 

Art Director 
Jay Simpson 

Design/Production 
Tricia Hayden 

Membership/Circulation Manager 
Christina Champion 

Advertising Coordinators 
Heidi Rex, Marian Tibayan 

Reader Service 
Marian Tibayan 


IEEE Computer Society 

10662 Los Vaqueros Circle 
Los Alamitos, California 90720 
(714) 821-8380 


Circulation: IEEE Micro (ISSN 
0272-1732) is published bimonthly 
by the IEEE Computer Society, 
10662 Los Vaqueros Circle, Los 
Alamitos, CA 90720-2578; IEEE 
Computer Society Headquarters, 
1730 Massachusetts Ave., NW, 
Washington, DC 20036-1903; IEEE 
Headquarters, 345 East 47th St., 
New York, NY 10017. Annual sub¬ 
scription: $18 in addition to IEEE 
Computer Society or any other 
IEEE society member dues; $33 for 
members of other technical organi¬ 
zations. Back issues: $10 for mem¬ 
bers; $20 for nonmembers. This 
journal is also available in micro¬ 
fiche form. 

Postmaster: Send address changes 
and undelivered copies to IEEE 
Micro, 10662 Los Vaqueros Circle, 
Los Alamitos, CA 90720-2578. 
Second-class postage is paid at 
New York, NY, and at additional 
mailing offices. 

Copyright and reprint permissions: Abstracting is permitted with credit to the source. Libraries are permitted to photocopy beyond the limits of US Copyright Law for private use of patrons those post-1977 articles that carry a code at the bottom of the first page, provided the per-copy fee indicated in the code is paid through the Copyright Clearance Center, 29 Congress St., Salem, MA 01970. Instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee. For other copying, reprint, or republication permission, write to Permissions Editor, IEEE Micro, 10662 Los Vaqueros Circle, Los Alamitos, CA 90720-2578. Copyright © 1989 by the Institute of Electrical and Electronics Engineers, Inc. All rights reserved. 

Editorial: Unless otherwise stated, 
bylined articles and descriptions of 
products and services reflect the 
author’s or firm’s opinion; inclu¬ 
sion in this publication does not 
necessarily constitute endorsement 
by the IEEE or the IEEE Computer 
Society. Send editorial correspon¬ 
dence to IEEE Micro, 10662 Los 
Vaqueros Circle, Los Alamitos, CA 
90720-2578. All submissions are 
subject to editing for style, clarity, 
and space considerations. IEEE 
Micro subscribes to the Computer 
Press Association’s code of profes¬ 
sional ethics. 



2 IEEE MICRO 






Cover illustration by Hajime 
Sorayama and Image Bank West 
Cover design by Hayden Design 
and Design & Direction 



Published by the IEEE Computer Society 


Departments 

Letters to the Editor  4 

From the Editor-in-Chief  5 

Micro World  7 
Neural computing: The new gold rush 

Micro Review  9 
The USA in today’s world; digital filters; Hyperdictionary 

Micro Law  84 
User interfaces and screen displays. Part I 

New Products 
RISCs; virus protection 

Advertiser/Product Index 

Micro View  96 
Design choices power the Next wave 

IEEE Computer Society Information  Cover 3 

Reader Interest/Service/Subscription cards, p. 64A; Change-of-Address form, p. 11 


Feature Articles 

Guest Editor’s Introduction 
Ken Sakamura 

An Overview of the BTRON/286 Specification 
Ken Sakamura, Yoshiaki Kushiki, and Kazuhiro Oda 
While individuals may find the 32-/64-bit TRON Project’s systems too costly, this 16-bit version should be more affordable and fill business data and graphics needs. 

A Floating-Point VLSI Chip for the TRON Architecture: An Architecture for Reliable Numerical Programming 
Shumpei Kawasaki, Mitsuru Watabe, and Shigeki Morinaga 
Preliminary evaluations of the FPU show a good combination of precision and performance. It supports IEEE floating-point arithmetic and the ANSI C draft proposal. 

The Data-Driven Microprocessor 
Shinji Komori, Kenji Shima, Souichi Miyata, Toshiya Okamoto, and Hiroaki Terada 
This five-chip set performs real-time signal and distributed parallel processing without a system clock or centralized control unit. 


Special Features 

The Transputer T414 Instruction Set 
Jean-Daniel Nicoud and Andrew Martin Tyrrell 
Have you wondered how to decipher the transputer instruction set? Here’s your chance to compare it with something you know. 

A Logical Design Tool for Relational Databases 
M. Mehdi Owrang O. and W. Gamini Gunaratna 
Here’s a way to prevent anomalies from affecting the design of your relational database. 



























Department 


Letters 


RISC tutorial comments 

To the Editor: 

I’m sorry, but I’m about to become 
one of those guys who forgets to tell 
you when you’ve done a great job and 
only complains when something bugs 
him enough. 

What bugged me enough was Lazzerini’s RISC article in IEEE Micro, Feb. 1989, pp. 57-65. I can’t quite see why you found this article worthy of publication. There isn’t one new idea in it. It’s a rehash of things that were all said before by more eloquent voices. It seems to me that anyone who wants to discuss RISC nowadays must concentrate on machines such as the MIPS R3000, HP Precision, [Sun Microsystems] Sparc, AMD Am29000, Motorola 88000, etc. If one wants to go over thoroughly trod ground such as that represented by the early academic RISC machines, it would behoove one to bring some new and original insights to the treading. If there was anything original in Lazzerini’s article, it escaped me. 

Lazzerini didn’t even define her terms, which could mislead the reader into thinking there is actually some consensus on what RISC means. Personally, I think there is an emerging consensus, but it appeared nowhere in Lazzerini’s article, nor [did] any mention of why it is that machines that all claim the RISC label can differ as much as the AMD, Motorola, MIPS, and Sparc do. This article was just not good enough and shouldn’t have appeared in Micro. But thanks for the others, they were worth reading. 

Robert P. Colwell 
Branford, CT 


To the Reader: 

IEEE Micro also received a letter 
from Daniel Tabak concerning the same 
article. He has kindly allowed us to edit 
the letter and approved the editing. 
Tabak makes the following points: 

The material in the Lazzerini article is available from other sources such as a tutorial by Stallings, “Reduced Instruction Set Computer Architecture,” Proc. IEEE, Vol. 76, No. 1, Jan. 1988, pp. 38-48; and a book by Hayes, 2nd ed., McGraw-Hill, New York, 1988. 

There are several sources of RISC processors available on the European market, which are good RISC models (Transputer, Acorn, Metaforth) and which were not covered in this article. More information can be found on these processors in Tabak’s RISC Architecture, Research Studies Press Ltd., UK, and John Wiley & Sons, New York, 1987, and in an article entitled “RISC Systems” in Microprocessors and Microsystems, Butterworth & Co., May 1988, by Tabak. 

The Katevenis dissertation was published as a book by MIT Press, Cambridge, Mass., in 1985. 

Lazzerini’s “The RISC Approach” 
drew some other responses. Of the 26 
other readers who have read and as¬ 
sessed the Special Feature article so far, 
three rated it as being of low interest, 
seven rated it of moderate interest, and 
16 rated it of high interest. Some other 
readers commented on the article; their 
comments appear in this issue under the 
In the Mailbag heading on p. 6. 

We welcome other comments from 
readers on this and any article appearing 
in IEEE Micro. 

Joe Hootman 
Editor-in-Chief 


Wondering where 
to get back issues? 

Members: You pay only $7.50 per copy for 1984 to 1987 
issues and $10.00 per copy for 1988 issues. 

Send prepaid orders to Customer Service, 

IEEE Computer Society, 10662 Los Vaqueros Circle, 
Los Alamitos, California 90720 








Department 


From the 
Editor-in-Chief 


I get letters... 


This year I have been impressed with how much mail has crossed my desk. I am sure that in previous years I have had just as much, but for some reason the mail seems to be out of hand this year. I very informally surveyed the mail that I received by saving two weeks of mail that came to me during the first part of March. In this two-week period I found that I’d received 88 ads in the form of letters, flyers, and brochures; 23 newspaper publications; six plastic-bagged cards; and 36 magazines, five of which were IEEE publications. I did not count local and job-related memos or the mail that I receive at home. 

If my mail is typical, a substantial 
amount of material is being produced 
for people to read and absorb. The real 
question is how publications like IEEE 
Micro stay in business and hold their 
market share. 

Well, Micro has several things going 
for it that make it a quality publication, 
and I think—everything else being 
equal—people will naturally migrate to 
a quality product. Every article that 
appears in the publication is reviewed 
by volunteers in the Computer Society. 

I send new manuscripts to one of the Micro editorial board members, who then selects three peer reviewers. These reviewers formally assess each manuscript for its technical content, originality of ideas, clarity of thought, and reader interest. They may suggest additions, deletions, major rewriting, or, as happens in a surprising number of cases, outright rejection. Clearly, the editorial board upholds Micro's high standards and exerts much influence over the technical content of each issue. 

The results of each review are given to me, and I in turn inform the author of the results of the review. After an author has satisfied the reviewers’ recommendations, the manuscript is ready for publication and earmarked for either a particular theme issue or the first-available opening as a special feature. Every accepted paper next passes through the editing process in the Computer Society’s Los Alamitos office. Either Marie English or Christine Miller works with the authors to produce a clear, articulate article that can be understood by both experts and novices in the industry. Readability is a top concern throughout this process. The quality of the editing is apparent in the publication, but it is really appreciated by the authors as well. Marie and Christine have received excellent reviews from our authors. 

The whole manuscript process benefits readers. Personally, I tend to read material that is current and that has had some editing and reviewing. I find that the editing process tends to filter out the unnecessary and eliminate some of the trivial. I suspect that most readers feel the same. 

Micro's departments have also been well received by a great number of readers. These departments play an important role in dispensing industry knowledge and are a tribute to the board members who generate them. Readers have indicated the popularity of such departments as New Products, Micro Review, Micro World, Micro Law, Micro View, and Micro News. All of these departments are edited. 

Lastly, Micro brings to its readers 
international news from Europe and the 
Far East in the form of two special 
issues each year from these regions. 

The current issue is edited by Ken Sakamura (University of Tokyo) and is one of the special issues reflecting ongoing work in the Far East. Ken is the father of the TRON concept as well as being an IEEE Micro editorial board member. Micro, as you may recall, was the leader in the introduction of the TRON concept to the United States reader. The project is interesting in that it has the potential to serve an international as well as a regional market. It is increasingly clear that in order to serve a large regional market efficiently, it is necessary to also serve a wider international market. A company can attempt to do this in two ways, by default or by design. TRON is an attempt to serve a world market by design. If TRON does succeed in the international market, the impact on the computer market in the United States will be substantial. It will be worth watching the development of the world market to see how TRON and TRON-related products fare. 

In addition to the Special Far East articles selected by Ken Sakamura, you will find an article by Jean-Daniel Nicoud and Andrew Tyrrell on the Transputer T414 instruction set. The transputer is a unique device that has changed, to some degree, the way we view parallel-computing hardware. Here’s a chance to look at the instruction set of this most interesting device. A second article written by Mehdi Owrang and Gamini Gunaratna discusses a logical design tool for analyzing data on a logical as well as a conceptual level. The concepts presented in this article allow the design of large relational database systems. 


In the mailbag 

October 1988 

“I would like to see articles on advanced inventory control, personnel, and payrolls when some 400 variables are to be managed.” M.K.D.A., Mosul, Iraq (Several journals and magazines evaluate and discuss the various aspects of business software, for example, Lotus, PC Magazine, and PC World. IEEE Micro does not see these topics as areas that are of interest to most of its readers.—J.H.) 


December 1988 

“I liked everything; it seems I found the right engine to power our graphics system. DSPs are my discovery of the month. I would like to see more on modern CPUs including RISCs and evaluation of performance of CPUs and systems. . . .” G.M., Warsaw, Poland (I am pleased that you found the special issue on signal processing of interest. The DSP issue is one of the most popular issues of Micro.—J.H.) 

“I liked the article on the TMS320C30 floating-point digital signal processor.” J.O.S., Makati, Philippines 

“I liked the articles on DSP.” A.P.T.G, Christchurch, New Zealand 

“I liked the TMS320C30 floating-point DSP article. I liked the article on a PC-based digital speech spectrograph.” A.N., Isfahan, Iran 

“I liked the DSP reviews. The benchmarks for all the DSPs ought to have been the same, which would have made comparison easier. The benchmark table for the DSP96002 is very good. I would like to see prices and ordering information for the reviewed products (DSPs for example). This is very useful for foreign countries (like India).” S.N., Pune, India. (Until manufacturers agree on common definitions that characterize their hardware and software and these definitions are universally accepted, prediction of device performance when compared with other devices is at best a risky business. As things stand now, it is not possible for us to edit tables of data to a standard.—J.H.) 

“I liked the whole issue, especially the articles on the TMS320C30 and the DSP96002. I would like to see more about Motorola’s new DSP chips.” C.R.G., Acassuso, Argentina 

“I liked the whole issue. I would like to see even more on DSP hardware and DSP algorithms.” S.G., Newcastle, England 

“I would like to receive more information on the subject on page 68 (‘A PC-based Digital Speech Spectrograph’) that includes the Radix FFT in the TMS32010 Assembly Language.” F.H.T., Baghdad, Iraq (I have sent your request to the author.—J.H.) 

“I would like to see single-chip microcomputer applications, low-power applications, the Motorola 68HC11 for example.” A.K., Calgary, Alberta 

“I would like to see more on CNC, process control/simulation, and factory automation.” A.B., Yverdon, Switzerland 

“I thoroughly enjoyed the DSP issue for updating on current processor status and architecture. There are always very diverse topics presented (in Micro), which is good for broadening one’s background. Keep up the good work.” G.L.S.C., Portland, ME 

“I liked the entire issue and I would like to see more.” K.S., Nepean, Ontario 


February 1989 

“I liked the magazine. I only wish I had time to read it from cover to cover.” T.B., Victoria, British Columbia (I am pleased that you take the time to read at least part of Micro.—J.H.) 

“I liked the real-time implementation of the Newton-Euler equations of motion on the NEC µPD77230 DSP.” M.J.T., Highland, MI 

“Thanks for a job well done, Jim; Joe, welcome! I would like to see authors’ full addresses with articles. I would like to correspond sometimes. . . .” A.D.W., Lewisburg, PA 

“Nice wording on the lead-in to Micro View interview! Top quality; nice job, Marie and Christine.” J.S., Phoenix, AZ (High praise indeed when coming from our magazine’s first managing editor. Thank you.—J.H.) 

“I disliked ‘The RISC Approach.’ It did not include the 286/386 while including the Z8000, which makes the article suspect.” D.F., Fort Wayne, IN (See Letters to the Editor in this issue.—J.H.) 

“I liked the RISC approach; good overview of approach. The inability of CISCs to expose microcode sequences for compiler optimizations is an often-overlooked point.” J.B.H., San Jose, CA (See Letters to the Editor.—J.H.) 









Department 



Micro World 


Hubert Kirrmann 

Asea Brown Boveri Research Center 
CRBC.l 

CH-5405 Baden, Switzerland 


Neural computing: The new gold rush in informatics 


In some abandoned mine, one stubborn miner hits pay dirt, and a new gold rush starts. This miner will become rich, but many others will never do so. What happened recently to the abandoned technology of LOX (liquid oxygen temperature) superconductivity currently repeats itself with neural computing. The neural computing technique, buried 20 years ago after seemingly unimpeachable proof of its inability, emerges now as a promising new direction in information technology, or informatics. 

Basically, neural networks mimic the structure of the neural system of animals. Elements with multiple inputs, which change state (fire) when the sum of their inputs exceeds a certain threshold, emulate the system’s neurons. They can be implemented by an analog summing amplifier with a threshold function, the inputs being connected over resistors. 
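
As a rough illustration of the threshold element described above, the following Python sketch models a single neuron. The function name, the example weights (standing in for the resistor connections), and the threshold value are assumptions chosen for illustration, not values taken from any particular network.

```python
def threshold_neuron(inputs, weights, threshold):
    """Return 1 (fire) when the weighted input sum exceeds the threshold, else 0.

    The weights play the role of the input resistors feeding the
    summing amplifier; their values here are purely illustrative.
    """
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total > threshold else 0


if __name__ == "__main__":
    # Two active inputs: 0.5 + 0.5 = 1.0, which exceeds 0.8, so the neuron fires.
    print(threshold_neuron([1, 0, 1], [0.5, 0.5, 0.5], 0.8))
    # One active input: 0.5 does not exceed 0.8, so the neuron stays off.
    print(threshold_neuron([1, 0, 0], [0.5, 0.5, 0.5], 0.8))
```
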

Neurons are organized in connected layers; the output of each neuron of a layer connects to the input of each neuron of the next layer. (Some networks also connect neurons within the same layer; others consider symmetrical connections.) The number of connections rises of course with the square of the number of neurons per layer. 
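
The quadratic growth can be checked with a few lines of Python. This is a minimal sketch that assumes every neuron of one layer connects to every neuron of the next and that there are no connections within a layer; the function name and the layer sizes are illustrative only.

```python
def connections_between_layers(layer_sizes):
    """Count connections when each layer is fully connected to the next one."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))


if __name__ == "__main__":
    # Three layers of n neurons each give 2 * n * n connections, so a
    # tenfold increase in n multiplies the connection count by 100.
    for n in (10, 100, 1000):
        print(n, connections_between_layers([n, n, n]))
```
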

The strength of each connection is assigned a weight. These weights may be modified to give the network certain capabilities, for example, to recognize patterns applied at the input layer. Contrary to classical computers, which are programmed, neural networks must be taught; they require a learning phase. Numerous learning algorithms exist, which dictate how the weights are modified in response to examples presented to the input layer of the network. Though these structures have been known for a long time, the breakthrough came from new learning algorithms that allowed us to teach intermediate (hidden) layers of neurons. 
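
The column does not single out one learning algorithm. As one well-known example of how weights can be modified in response to examples, the sketch below applies the classic perceptron rule to a single threshold neuron. The function names, the learning rate, the epoch count, and the choice of the logical AND as training data are all assumptions made for illustration. This simple rule trains one neuron only; it is not one of the newer algorithms for hidden layers that the text refers to.

```python
def fire(inputs, weights, bias):
    """Threshold neuron: output 1 when the weighted sum plus bias is positive."""
    return 1 if sum(x * w for x, w in zip(inputs, weights)) + bias > 0 else 0


def train_perceptron(examples, n_inputs, rate=1, epochs=20):
    """Adjust weights after each example in proportion to the output error."""
    weights = [0] * n_inputs
    bias = 0
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - fire(inputs, weights, bias)
            # A wrong answer shifts each weight toward the desired output;
            # a correct answer (error == 0) leaves the weights unchanged.
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias


# Illustrative training set: the logical AND of two binary inputs.
AND_EXAMPLES = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

if __name__ == "__main__":
    weights, bias = train_perceptron(AND_EXAMPLES, n_inputs=2)
    for inputs, target in AND_EXAMPLES:
        print(inputs, "->", fire(inputs, weights, bias), "expected", target)
```
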

The first conferences on neural networks attracted an incredible number of believers in the new branch of progress. Proceedings of more than 2,500 pages pile up. In Europe two conferences, Neural Networks from Models to Applications (IDSET, Paris, June 1988) and Connectionism in Perspective (Zurich, October 1988), were very well attended, especially compared with conferences in the classical computing fields. 

Informaticians and engineers are only a minority of the participants. Others represent domains as far apart as zoology, drug-addiction psychology, brain research, thermodynamics, theoretical physics, mathematical statistics, and artificial intelligence. Expectations are high, confusion reigns, and concrete results are few. 

Basically, we are still at tutorial time. Tutorials on neural networks used to begin with a description of the biological nervous system, which makes any neurobiologist unhappy. It is not just a matter of professional jealousy, but the fact that neuro-informaticians tend to fall in love with their biological model, although the methods they use often have no counterpart in nature. This is especially true of the very hypothesis that allowed the breakthrough in neural computing, namely that neuron connections are bidirectional and symmetrical. Nevertheless, it does not prevent neuro-informaticians from using biological terms in profusion, like “rewarding the network” or “killing neurons.” Personification of computing systems often becomes a sign of exaggerated expectations. 

Just as with rule-based artificial intelligence, the technological wave will settle down, and neural networks will complement classical computing and rule-based experts in some domains, rather than replace them. The domains of choice for neural networks are those in which rules are difficult to establish, because they are either unknown or too complex. The task of the neural network is to find rules and discover patterns in seemingly unorganized data. Pattern recognition for voice, image, or signals is already a mainstream technology. 

Neural networks will also help find rules in complex data sets, which will be used as a base for other algorithms. A special domain is adaptive process control, which is in fact the only branch of informatics that never gave up the principle of analog computation. The backtracking algorithm, for instance, originated in adaptive control. 

Current examples of neural networks are still small in scale. They face the same questions expert systems once faced: How do they scale when the problem becomes complex enough to be interesting? Can one really build a network of interesting size? How long does it take to program a network? Can learning time be reduced by mapping weights from another network, or must each network be taught independently? 

Let’s divide the domain of research into an algorithmic part, one that represents knowledge, another that simulates networks, and a final one that realizes networks. 


0272-1732/89/0600-0007$01.00 © 1989 IEEE 
















• In the algorithmic part, we get most ideas from theoreticians and mathematicians. Theoreticians closer to the world of thermodynamics than to informatics rediscovered neural networks. 

• The representation of knowledge, that is, the mapping of the problem and its solution to the network, is closely related to the application. Here, many disciplines feel at home; everybody participates. 

• Simulation of networks on classical computers is rather trivial, with most of the work occurring on the user interface. Numerous universities and research departments work on teaching aids. Simulators with a small number of neurons work fast enough to be used in real time (which makes them already a realization). The fact that sequential machines can simulate neurons makes it clear that parallelism and algorithms are separate issues. Some simulators are made on parallel processors, for instance on transputers, thus reflecting the parallelism of neural networks. The performance gain is not evident if one considers that most of the computing power is spent in communication. 

• The realization of large neural networks in silicon brings tremendous challenges to hardware designers. While the neuron itself is rather simple, wiring is the problem. Nature knows ways to bus 10,000 signals to a single neuron, but we are unable to connect more than a hundred connections to a bus. The same problems appear today in classical radar signal processing, which requires the interconnection of some 20,000 processors. An alternative proposal could be optical neural networks. 

In Europe, activities are numerous
and some interesting results have already
been achieved. The best-known
work is possibly T. Kohonen’s speech-recognition
network, based on the mapping
principle and developed at the Helsinki
University of Technology. The European
Community sponsors the BRAIN (Basic
Research in Adaptive Intelligence and
Neural Computing) and ESPRIT projects.
The governments finance national
projects, and the universities open faculty
positions for neurocomputing.

Since it is still too early to describe 
works that most of the time are just in 
their infancy, I list only the research activities
that may be of interest to the
readers of IEEE Micro. I give only one 
contact person, and I’m sure I have 
omitted many works; please protest. 
Contacts between these institutes and 
readers of this column will help build a 
meganeuron network by increasing the 
reciprocal weights. Since there are too 
many to list, I depend on readers to 
locate addresses for contacts of interest. 

I would like to thank J. Bernasconi of 
the Asea Brown Boveri Research Center 
who helped me explore the field of 
neural computing and prepare this 
discussion. 


Neural Network Research Activities

Institute/Contact                        Research specialty

Denmark

Nordita, Copenhagen                      Statistical learning
J.A. Hertz                               Image processing

England

University of Edinburgh                  Statistical physical methods
D. Wallace                               Associative memories
                                         Parallel processors

Imperial College, London                 Boolean neural networks
I. Aleksander

Royal Signals and Radar                  Diagnosis
Establishment, Malvern                   Optimization
D.G. Bounds                              Multilayer topological maps

Finland

Helsinki University of Technology        Topology-conserving maps
T. Kohonen                               Content-addressable memories
                                         Speech recognition

France

E.S.P.C.I., Paris                        Associative memories
G. Toulouse                              Pattern recognition
                                         Spin-glass concepts
                                         Optimization

Ecole Nationale Superieure, Paris        Associative memories
J.P. Nadal                               Learning

CEN-Saclay, Gif-sur-Yvette               Associative memories
B. Derrida

University of Paris V                    Feedforward networks
F. Fogelman                              Backpropagation learning
                                         Medical diagnosis

Ecole Polytechnique, Grenoble            Signal processing
C. Jutten                                Associative memories

Germany

Technical University, Munich             Topology-conserving maps
H. Ritter                                Motor task learning

University of Giessen                    Dynamics of learning
M. Opper                                 Pattern recognition

University of Heidelberg                 Associative memories
J.L. von Hemmen                          Pattern recognition

Continued on p. 94










Department 


Micro Review 



Richard Mateosian 
2919 Forest Avenue 
Berkeley, CA 94705-1310
(415) 540-7745 


The USA in today’s world


At the end of World War II the USA's
newly awakened economy stood in
sharp contrast to the war-ravaged economies
of the rest of the industrialized
world. Military and industrial supremacy
came easily. Perhaps complacency
and laxity crept in. Self-examination
and long-range economic, social, and
environmental planning were not central
to the country’s life. Now, more than 40
years later, the chickens have come
home to roost.

Two recent books address the present
American situation in positive ways.
One of these, by Robert McIvor of
Motorola, focuses specifically on what
the semiconductor industry needs to do
to compete in the world economy. The
other, by Frances Moore Lappe of the
Institute for Food and Development
Policy, sees economic well-being arising
from an active and open discussion
of America’s underlying assumptions
and values. Interestingly, these two
works, coming from strikingly different
perspectives, lead us to confront the
same basic issues.


Managing for Profit in the
Semiconductor Industry, Robert
McIvor (Prentice Hall, Englewood
Cliffs, N.J., 1989, 544 pp.; $30)

If you believe, as I do, that Motorola
has become a company capable of competing
in the world semiconductor
marketplace, you should be interested
in Robert McIvor’s account of how
Motorola got that way and what it’s
doing to stay competitive.

McIvor ridicules the notion that we
are simply moving into a post-industrial
service economy in which the best jobs
stay here, while low-level manufacturing
functions migrate across the Pacific
Ocean. He laughs at the idea that there
is refuge in “niche markets” like ASICs.
His view is that in a highly capital-intensive
industry like semiconductor
manufacturing, market share is a prerequisite
to profitability. While everyone
knows that market share depends on
product utility, on-time delivery, price,
and service, the successful Asian
companies seem to have learned how to
achieve these goals. American companies
have some catching up to do.

According to McIvor the key to success
is “world-class factories,” by which
he means an integrated process that
encompasses manufacturing, marketing,
and engineering. His book deals with
that process:

Its theme is transition . . . from economies
of scale to economies of scope, from
standardization and cost reduction to mix
management and revenue maximization,
from labor-intensive production systems
to knowledge-intensive ones, from
inventory-profligate functional manufacturing
systems to just-in-time and short-cycle
manufacturing methods, from
inspecting-out-rejects quality systems to building in
quality using real-time-feedback process
control, from offshore escape to onshore
renewal using the most contemporary
manufacturing techniques. . . .

This pretty well covers the management-mechanics
aspects of the book,
which make up the bulk of the pages,
but McIvor still has important things to
say about people and attitudes. He sees
the importance of motivation: not just
the often negative motivation of “management”
in the sense of control,
coercion, pushing from behind, prescribing
in detail, and so on, but the positive
motivation of leadership. McIvor sees
leadership as appealing to positive
qualities: the need to accomplish and
be appreciated, the need to belong to a
group and contribute to a common
enterprise.

Other people have said many of the
same things. Often, however, the unspoken
message that comes from watching
to see who succeeds and who fails in an
organization is different from the
spoken message. Putting into practice
the principles of motivation and leadership
that McIvor advocates means creating
a corporate culture within which everyone
feels them as realities. There’s
no formula for how to do this, but I
think that open and continuing discussion
of fundamental beliefs and values
is a prerequisite. That’s why the next
book is so interesting.


Rediscovering America’s Values,
Frances Moore Lappe (Ballantine, New
York, 1989, 349 pp.; $22.50)

The Lincoln-Douglas senatorial
debates of 1858 were wide-ranging affairs
that went on for hours, but people
followed them intently. According to the
Beards’ Basic History of the United
States:

To the discussions men and women
had flocked on horseback, in farm wagons,
in carriages and on foot. . . . They
followed the arguments of the debaters
and weighed the clashing opinions soberly,
with due recognition of their significance.
At the same time newspapers
had carried far and wide full reports of the
debates; and citizens all the way from
Maine to California could make up their
minds on the merits of the arguments and
the plans for meeting the impending
crisis.

What Frances Moore Lappe has tried
to do, in the absence of substantive discussions
of issues by present-day public
figures, is to concoct a Lincoln-Douglas
debate for our own time. Of course,
since she has provided both sides of this
debate, and since she clearly identifies
with one of them, the obvious question
of impartiality arises. However, what
struck me in reading this book was how
well matched the two sides were. There
are no knockout punches in this contest.

The issues in this debate are of a
general nature, but they go to the heart
of the problem that Robert McIvor
addresses from his perspective as a
semiconductor executive. Are economic
health and social justice related? Are
they mutually exclusive, mutually reinforcing,
or something in between? How
do you define and measure either of
these conditions? How do they relate to
freedom?

As there are few one-liners or sound 
bites in this dialogue, I’m not going to 
try to give you any highlights. Go get 
the book and read it. 


Miscellany 

Dictionary of Computer Terms, 3rd
ed. (Webster’s New World, New York,
1988, 426 pp.; $6.95)

I don’t know why I bother reviewing 
books like this, but some things irk me 
so much that I can’t resist. None of the 
definitions in this book struck me as 
egregiously bad. On the other hand, 
none that I can recall seeing as I 
browsed through it had any depth or 
gave me any new insights. Some were, 
to put it kindly, a little flaky. For 
example: 

Ada. It is named for Ada Augusta 
Lovelace, the first female programmer.... 

Add-in. Refers to a component that can
be added to a circuit board already installed
in a computer. . . .

Blanking. On a display screen, not displaying
a character or leaving a space.

Chain. (1) Linking of records by means
of pointers in such a way that all records
are connected, the last record pointing to
the first. . . .

Queue. Queues are nothing more than 
the waiting lines that have become an 
accepted and often frustrating part of 
modern life. . . . 

There is a sample program occurring 
in the definition of Basic. It contains the 
following line: 

170 REM ** PI - PRINCIPLE FOR 
THE PRESENT MONTH ** 

Will someone please send the 
Webster’s New World folks a dictionary 
of English. 


HyperDictionary, Philip J. Brown
(Van Nostrand Reinhold, New York,
1988, 220 pp.; $19.95)

After waxing enthusiastic about HyperCard,
Brown tells us in his preface:
“The only thing missing has been a
complete and comprehensive guide to
the Hypertalk language. . . . Here it is.”

Well, no it isn’t. This book is a creditable
and useful piece of work, but it has
a few shortcomings. For example,
Brown shrugs off the possibility of
interfacing with compiled code as “not
something most users will want to do,”
and fails to deal with it at all, let alone
completely and comprehensively.

Another mistake he makes is the
common one of failing to distinguish
between parameters and arguments. His
definition of argument is: “Used in
some computing books to mean
parameter.”

Starting from this point, he now tries
to explain how to define functions, and
the reader is left wondering whether the
function’s parameters are global variables,
like Basic subroutine arguments,
or are limited in scope to the body of the
function.

Another place where Brown leaves
the reader wondering is in his attempts
to describe arithmetic. He says several
times that HyperCard does arithmetic by
using the SANE package, and if he
would just leave it at that he’d be on
safe ground. Unfortunately, he doesn’t.
For example, here are excerpts from his
entry, number storage:

HyperCard stores everything as strings
of characters, even numbers. . . . When it
has to perform numerical operations,
HyperCard first converts the strings into
its internal representation of numbers. If
the result is . . . stored in a variable, it
will retain the internal representation, and
so its full accuracy. The internal representation
defaults to 6 places of decimals because
the default value of numberFormat
is ‘0.######.’

Either HyperCard’s behavior is bizarre,
or Brown has confused internal
representation with printing format.
Similarly, he identifies specific decimal
representations of e and pi as being built
into HyperCard, although one assumes
that HyperCard would identify these
constants to SANE by name and use its
values for them.
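
The distinction at issue here, between a number's internal representation and the format used to print it, can be illustrated with a minimal sketch. This is Python, not Hypertalk, and the helper name `display` is hypothetical; the only assumption is that a format such as ‘0.######’ limits the digits shown without altering the stored value.

```python
import math


def display(value: float, places: int = 6) -> str:
    """Format value with at most `places` decimals (hypothetical helper)."""
    text = f"{value:.{places}f}"
    # Drop trailing zeros, roughly what a '0.######'-style format would show.
    return text.rstrip("0").rstrip(".")


stored = math.pi          # full double-precision value held internally
shown = display(stored)   # the printed form only

# Re-reading the printed form loses digits; the stored value never did.
roundtrip = float(shown)
```

The printed string is a rounded view of the stored number. Precision is lost only if the string is later read back in place of the original value, which is why a printing default says nothing about internal accuracy.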

I don’t want to keep carping about 
these fine points. This is not a bad book, 
but it doesn’t live up to its promise. The 
Hypertalk documentation void remains 
unfilled. This is a job for Apple, and 
Apple ought to do it. 


Digital Filters, 3rd ed., R.W.
Hamming (Prentice Hall, Englewood
Cliffs, N.J., 1989, 300 pp.; $48)

In his preface Hamming says: 

Digital filtering includes the processes
of smoothing, predicting, differentiating,
integrating, separation of signals, and
removal of noise from a signal. Thus
many people who do such things are actually
using digital filters without realizing
that they are; being unacquainted with the
theory, they neither understand what they
have done nor the possibilities of what
they might have done. Computer people
very often find themselves involved in
filtering signals when they have had no
appropriate training at all. Their needs are
especially catered to in this book.

One way Hamming caters to the needs
of “computer people” is by assuming
only calculus and a little statistics,
which he reviews. He assumes no
background knowledge of electrical
engineering. Of course, he develops
additional mathematics as he needs it;
the subject is basically mathematical.
He presents the mathematical arguments
clearly. The formulas are competently
typeset and clearly displayed. The
figures are clear, readable, and well
integrated with the text.

In his acknowledgments Hamming
thanks the Naval Postgraduate School
(Monterey, Calif.) “for providing an
atmosphere suitable for thinking deeply
about the problems of teaching.” He
seems to have made good use of that
opportunity. If you need to learn about
digital signal processing (doesn’t everyone?),
and you’ve been put off by the
formidable appearance of the books
you’ve seen on the subject, get this one.





For the Record, Carol Pladsen and 
Ralph Warner (Nolo Press, Berkeley, 
Calif., 1989, 295 pp. plus diskette; 
$49.95) 

For the Record is a book and a program
(PC and Macintosh versions are
available). The program is the focal
item, but the book is of more importance
and higher quality than most program
manuals. In fact, the program and book
are an outgrowth of an earlier book that
provided forms to support a manual personal
record-keeping system.

Nolo Press specializes in self-help
law books. In this case the legal angle is
estate planning, and a major objective of
the record keeping automated by the
program is to allow your affairs to be
understood and managed in the event of
your incapacity or death. Of course,
well-kept records may certainly be
useful to you even if you are not
incapacitated.

The great virtue of the record-keeping
program is its exhaustive nature. You
begin in and continually return to a
menu of 27 main categories of records.
Selecting any of these opens a further
menu of subcategories, and selecting a
subcategory takes you to a screen designed
for that subcategory. Each screen
marks the start of an entry for that subcategory.
Some subcategories have
single-screen entries, while others continue
over several screens. You can
open as many entries as you like in each
subcategory and move through a linked
list of them using forward and backward
buttons at the bottom of the screen. Any
category can be assigned a password
and locked, a precaution which, the
authors admit, can be circumvented by
a dedicated hacker.

Each entry is designed with fixed 
fields into which you enter the specified 
information—serial numbers, dates, 
descriptions, locations, or whatever. The 
fields have fixed upper limits on their 
sizes, but additional information can be 
appended by opening a notes window 
for the entry. 

The largest part of the book, called 
“Legal and Practical Information about 
Your Records,” parallels the main menu 
of the program. The authors discuss 
each category in a general way, with 
minimal direct reference to the entries 
and fields implemented in the program. 
Another part of the book is a 35-page 
treatise entitled “Estate Planning 
Basics,” which is even less directly 
related to the program. This kind of
information is available from many
sources. Reviewing it doesn’t really fall
within my area of expertise, but it
seemed clear and well written to me.

The remaining parts of the book (no
more than 20 percent in all) are “How
to Use For the Record,” the actual manual
for the program, which comes in two
flavors, PC and Macintosh; and “Reference,”
a glossary, subcategory index,
and extremely brief troubleshooting
guide.


I spent several hours entering my own
personal records and am far from
finished. I have a variety of small
quibbles about the design of the user interface
and the editing and publishing of
the book, but they’re not worth mentioning.
I like this program, and I think I’ll
finish compiling my personal records
someday, if I can find the time.


Reader Interest Survey 

Indicate your interest in this department by circling the appropriate number 
on the Reader Service Card. 

Low 177 Medium 178 High 179 














Computer Projects 
in Japan 




Ken Sakamura 
University of Tokyo 



To learn what is going on in another country is often difficult. You
may come across some articles with a few statistics or some
comments about the country in which you are interested. But
putting these pieces together to form a comprehensive picture of the scene
in a foreign country is no easy task. I hope the articles assembled in this issue
of IEEE Micro help readers acquaint themselves with the Japanese computer
scene.

Since the articles in this issue can cover only a tiny portion of the 
computer activities, I summarize here very briefly three of Japan’s ongoing 
computer projects, all major projects of particular interest to Micro readers. 
The Fifth Generation Computer project develops artificial intelligence 
systems, the Sigma project strives to increase the software productivity of 
the Japanese computer software industry, and the TRON project, the one in 
which I am deeply involved, aims at establishing a computer system 
architecture. 

The Fifth Generation Computer and Sigma projects have strong financial 
backing from Japanese government agencies. The TRON project, on the 
other hand, receives support from 130 commercial organizations that 
include European, American, and Japanese companies. 

The 10-year Fifth Generation Computer project is now into its eighth year
of operation. Last year (1988) saw the demonstration of Multi-PSI, a cluster
of 64 PSI-II sequential inference machines. Using these inference machines,
project designers developed KL1, a parallel language; PIMOS, a
multiprocessor operating system; and PIM (Parallel Inference Machine),
which runs Prolog rapidly and will eventually contain 100 processors. In the
last three years of the project, the stated goal is to build a prototype of PIM
that contains 1,000 processors.

The Sigma project, started in 1985 and extending to 1989, aims at making
a standard software development environment available to computer software
developers. Sigma-OS (a variant of Unix), a group of software
productivity tools called the Sigma tools, and Sigma workstations form the
basis of this environment. Plans call for setting up a centralized depository
of topics on software productivity tools known as the Sigma center. This
center will disseminate the latest information to Sigma workstations connected
to it, thus promoting software productivity by providing the latest
tools and recycling reusable components. In 1990 we expect to see Sigma-OS
version 1 and the full use of Sigma workstations and tools.






The TRON project’s goal is to produce an HFDS, or 
Highly Functionally Distributed System, in which a 
very large number of computer objects are connected. 
The number, on the order of millions and more, far 
exceeds the number of processors in existing computer 
networks. The network is heterogeneous. Currently, 
project participants work at designing the components 
of the computer architecture to make the HFDS a 
reality. 

The readers of Micro are aware that I introduced 
articles about the TRON project in 1987 and in 1988. 
This latest update on the project should mention that six 
companies now produce the VLSI CPU family of chips 
based on the TRON specification. The Gmicro/200 and 
TX1 CPUs are commercially available, and samples of 
the floating-point processor unit, cache controller, and 
DMA controller to be used with the CPUs have been 
released. Various sources, including US companies 
such as Microtec Research, Inc. in Santa Clara and 
Ready Systems Corp. in Sunnyvale, California, offer 
the software tools to support software development for 
the CPU family. Designers are also working on a 
general-purpose system bus called Tobus/Toxbus. 

The industrial version of the TRON operating system,
ITRON, now includes a smaller specification
called Micro-ITRON that is targeted to single-chip
CPU application. Micro-ITRON will be used in high-end
home appliances and will be important in developing
electronics goods that connect to the HFDS environment
in the future. Also in design is the ITRON2 (an
interim name) operating system for the VLSI CPU in
the TRON project.

Designers working with BTRON, the business version
of TRON, have produced some prototypes for
software development. BTRON incorporates multimedia
capability, as you can see from one of the articles in
this issue. Samples of such machines will become
commercially available in Japan through special outlets
such as third-party software publishers or system
houses interested in building new BTRON applications.

The network operating system tying these pieces
together is called MTRON, or Macro TRON. MTRON
researchers have formed and activated TRON computer
housing, computer building, and urban development
projects. The projects include construction companies
and furniture manufacturers in their membership.
The inclusion of these participants will ensure
that the HFDS can really be built on an experimental
basis. Their activities will bring new insights to future
TRON design activities.

The articles from Japan in this issue contain a discussion
of a data-driven VLSI processor for consumer
electronics developed by Mitsubishi Electric, Sharp,
and the Osaka University team. Hitachi’s article concerns
a numerical processor developed for the TRON
project that should also be of interest to workstation
designers. The BTRON/286 article describes the first
implementation of the BTRON architecture on 286-based
computers.

I hope the articles in this issue give you some idea of 
the current development activities in Japan. 



Ken Sakamura is an associate professor in the Department of 
Information Science at the University of Tokyo. He initiated 
the TRON project in 1984. Under his leadership, several 
universities and over 100 manufacturers now participate in 
the project to help build computers in the 1990’s. In addition 
to his involvement with TRON, Sakamura chairs several 
committees of the Japan Electronics Industry Development 
Association and the Information Processing Society of Japan. 
He has written numerous technical papers and books and 
received the BS, ME, and PhD degrees in electrical engineering
from Keio University in Yokohama. He is a member of the
IEEE and the IEEE Computer Society. 

Questions concerning this article can be directed to Ken 
Sakamura, Department of Information Science, Faculty of 
Science, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, 
Tokyo 113, Japan. 


Reader Interest Survey 

Indicate your interest in this article by circling the 
appropriate number on the Reader Service Card. 

Low 150 Medium 151 High 152 












An Overview of the 
BTRON/286 Specification 



While individuals may find the 32-/64-bit TRON Project's
systems too costly, this 16-bit version should be more
affordable and fill business data and graphics needs.

Ken Sakamura 
University of Tokyo 

Yoshiaki Kushiki 
Matsushita Electric 
Industrial Co., Ltd. 

Kazuhiro Oda 
Toshiba Corporation 


BTRON, or Business TRON, is a set of operating system specifications
for workstations and personal computers that employ bitmapped
displays. The primary goals set for BTRON call for the
realization of easy operation through a standardized human-machine interface
(HMI) and the assurance of data compatibility across different applications
and machines. Second, BTRON establishes a method for exchanging
data and control across applications. Third, BTRON realizes a document
management system in which text and graphics can be mixed freely and
layouts handled with ease. And fourth, it offers standard support of English,
Japanese, and other languages, providing a wide variety of fonts.

The BTRON specification incorporates an innovative information management
model known as the real/virtual object model. Other new concepts
include BTRON’s use of tags called fusen to define relations between
applications and data. HMI specifications extend from keyboard and pointing-device
design to the movement of objects on the screen and basic editors
for both text and graphics. The standardized HMI, data compatibility, and
other aspects take form in a consistent design approach followed throughout
the entire TRON-project architecture.

The BTRON specification achieves maximum efficiency when it is
implemented on a family of processors designed as part of the TRON
project: the VLSI CPU. The clearly hierarchical operating system design
also assures ease of implementation when the nucleus is realized on other
processors. Here we explain some details about this nucleus and the
extended nucleus of the BTRON specification. We concentrate on the
BTRON/286 specification, which is suitable for currently available 16-bit
microprocessors. (See the BTRON/286 hardware and video image handling
boxes on pp. 16 and 17.) Additional features, not found in the BTRON/286,
will be added to the BTRON/TRON VLSI CPU specification. Refer to
Sakamura¹⁻³ for a general overview of BTRON/286 plans.


Software configuration 

The BTRON specification defines a single-user, multiprocess operating 
system designed for a superior human-machine interface. It consists of the 
following three broad levels: 

• the operating system nucleus, 

• the extended nucleus, and 

• system applications. 

Figure 1 on p. 16 outlines the BTRON software configuration. 






Operating system nucleus

The operating system nucleus provides basic
services such as process management, interprocess
communication, synchronization, and a systemwide
naming space. Let’s look at some of the features.

Process management. Processes are discrete units
of the overall processing involved in program execution.
A process ID (>0), given to a process when it is
created, distinguishes the different processes. Processes
have four basic types of status (see Figure 2 on
p. 17):

• Nonexistent. The process has not been created.

• Ready. The process is executable and waiting to be
dispatched.

• Running. The process is executing.

• Waiting. The process is waiting for a message, the
passage of an indicated amount of time, input/output,
and so on.

Processes go through the following transitions from
one status to another, as a result of system calls or
scheduling (dispatching, preempting).

Process priority and scheduling. When a process is
created, it receives a priority numbered between 0 and
255, with 0 being the highest priority.
Scheduling proceeds based on the assigned
priority. For this purpose, processes are classified
into three groups: absolute priority (priority between
0 and 127), round-robin 1 (priority between 128
and 191), and round-robin 2 (priority between 192 and
255). Scheduling takes place as follows:

1) If there are processes in the absolute priority group
on ready status, the process with the highest priority
changes to running status and executes. Otherwise,
scheduling moves to step 2.

2) If there are processes in round-robin group 1 on
ready status, scheduling assumes relative priority (explained
later). The selected process (not necessarily the
one with the highest priority) changes to running status
and executes. Otherwise, scheduling moves to step 3.

3) If there are processes in round-robin group 2 on
ready status, scheduling assumes relative priority. The
selected process (not necessarily the one with the highest
priority) changes to running status and executes.
Otherwise, scheduling starts over from step 1 above.
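
The three-step selection above can be sketched as follows. This is a minimal Python illustration, not the BTRON/286 implementation: the `Process` class, the `group` and `pick_next` functions, and in particular the `relative_priority` formula are assumptions made for the example, since the text states only which inputs relative priority depends on, not how they are combined.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Process:
    pid: int
    priority: int              # 0 (highest) to 255 (lowest), set at creation
    cpu_time: int = 0          # CPU time used so far (hypothetical units)
    times_scheduled: int = 0   # how often the process has been dispatched


def group(p: Process) -> int:
    """0 = absolute priority, 1 = round-robin 1, 2 = round-robin 2."""
    if not 0 <= p.priority <= 255:
        raise ValueError("priority must be between 0 and 255")
    if p.priority <= 127:
        return 0
    return 1 if p.priority <= 191 else 2


def relative_priority(p: Process) -> int:
    # Placeholder formula (an assumption): a process that has used more
    # CPU time or been scheduled more often ranks as less urgent.
    return p.priority + p.cpu_time + p.times_scheduled


def pick_next(ready: List[Process]) -> Optional[Process]:
    """Return the ready process to run next, or None if none is ready."""
    for g in (0, 1, 2):
        candidates = [p for p in ready if group(p) == g]
        if not candidates:
            continue  # nothing ready in this group; try the next step
        if g == 0:
            # Step 1: strict priority, smallest number wins.
            return min(candidates, key=lambda p: p.priority)
        # Steps 2 and 3: smallest relative-priority value wins.
        return min(candidates, key=relative_priority)
    return None
```

With this placeholder formula, a round-robin process with priority 130 that has already consumed 50 units of CPU time loses to a fresh process with priority 150, which matches the remark that the selected process is not necessarily the one with the highest static priority.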

Relative priority is a dynamic priority that is calculated
using the statically assigned priority, CPU time
usage of the process, and frequency with which the
process has been scheduled. Relative priority keeps a



The BTRON/286 
Hardware 

The BTRON/286 is currently the only TRON
Project implementation that can run on hardware
within reach of an individual user. The hardware
specification appears in Table A and is comparable
to that of an IBM PC AT. Although a hard disk is not
mandatory, the designers highly recommend it.

Since the BTRON/286 runs in the protected mode
of the 80286 CPU, 16 megabytes of address
space can be accessed.

Figure A is a photograph of the computer that runs
the BTRON/286.


Table A. Specifications.

Unit                  Description

CPU                   Intel 80286
Clock                 8 MHz
Main memory           3 megabytes
Floppy disk           3.5 inch, 1 megabyte × 2
Hard disk             20 megabytes (optional)
Display resolution    640 × 400
Display colors        16 out of a 4,096 palette
Keyboard              The TRON keyboard



Figure A. An 80286-based computer runs the
BTRON/286, an operating system based on
the BTRON specification. The machine contains
a special Video Processor Unit that
makes it possible to superimpose video images
from external sources on the screen.


[Figure 1 diagram, layers from top to bottom: user applications;
system applications (basic text editor and basic graphic editor);
extended nucleus; display primitives; OS nucleus (memory, process,
clock, system, file, event, and device management); device drivers;
hardware devices.]

Figure 1. BTRON software configuration.


Video Image Handling 
on the BTRON/286 Implementation 


The BTRON/286 hardware uses an optional Video
Processor Unit to handle video image data. The VPU
permits external video signals to be displayed on the
screen of a personal computer, then digitized and
stored into internal frame memory. Users can enlarge
or shrink the image as well as repeat small
images (tiling). Also, the digitized images can be
shown superimposed on the computer screen. Other
operations such as inversion and logical operations
on the digitized images are available. With this VPU
users can manipulate animated images taken from a
videodisc or videocassette recorder system.

The extended nucleus of the BTRON/286 operating
system contains a built-in software module called
the video manager. This module isolates the application
programs from the details of the video hardware
and offers a set of logical interface functions to
access the video images. This isolation makes it very
easy to develop application programs for handling
many digitized images. Figure B is an example of an
application that uses the video manager on the
BTRON/286.

Figure B. Video, graphics, and text windows
opened simultaneously on the BTRON/286 screen.





low-priority process from waiting forever. 

Normally, system processes and real-time processes 
belong to the absolute priority group, while ordinary 
application processes belong to the other two groups. A
process in the absolute priority group (priority between
0 and 127) can lock or unlock itself. Locking keeps the
memory space of the process resident in memory, and
such a process is said to be in lock status. Once a
process has been created, its priority can be changed
only within the same group; it cannot be shifted to a
different priority group.

Message communication among processes. Each
process has a message queue through which it receives
messages from other processes. A message moves to the
message queue of the process whose process ID is given
as the destination. See Figure 3. The source is likewise
identified by the process ID of the sending process. A
message queue has a FIFO (first in, first out) structure,
so that messages always enter the queue in the order in
which they are sent. If the message queue at the
destination is full when a message is sent, the sending
process receives an error code that tells it to wait until
space becomes available. Certain types of messages can
be received selectively by specifying their types.
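
The queue behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name MessageSystem, the error code E_FULL, and the fixed capacity are hypothetical and do not come from the BTRON/286 specification.

```python
from collections import deque

E_FULL = -1  # hypothetical error code: the destination queue is full


class MessageSystem:
    """Toy model of per-process FIFO message queues (names are assumptions)."""

    def __init__(self, capacity=4):
        self.capacity = capacity   # assumed fixed capacity per queue
        self.queues = {}           # process ID -> deque of messages

    def create_process(self, pid):
        self.queues[pid] = deque()

    def send(self, src_pid, dst_pid, msg_type, body):
        """Append to the destination queue, or report that it is full."""
        queue = self.queues[dst_pid]
        if len(queue) >= self.capacity:
            return E_FULL          # the sender is expected to wait and retry
        queue.append((src_pid, msg_type, body))
        return 0

    def receive(self, pid, wanted_types=None):
        """Return the oldest message, optionally only of the given types."""
        queue = self.queues[pid]
        for index, message in enumerate(queue):
            if wanted_types is None or message[1] in wanted_types:
                del queue[index]
                return message
        return None                # nothing suitable is queued
```

Selective reception is modeled here as a scan for the oldest message of a wanted type, which leaves the relative order of the remaining messages unchanged.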

Figure 2. Basic status transitions of processes.

Figure 3. Message communication among processes
(sending process to receiving process).

Synchronization and control among processes.
Counting semaphores act as mechanisms for
synchronization among processes and for mutually
exclusive usage of shared resources. A semaphore ID
assigned at the time of creation identifies the
dynamically created semaphores. Once a semaphore has
been created, it can be used by every process. Figure 4
shows the semaphore operation.

Figure 4. A semaphore operation. A semaphore
(designated by semaphore ID) holds a current count
value (>= 0). Creation or reinitialization sets the
count to the initial count value, release adds the
released count value, acquisition subtracts the acquired
count value, and deletion removes the semaphore. The
figure shows processes releasing a semaphore and
processes waiting for semaphore acquisition.

Figure 5. A global name data operation. Each item has
a name (8 characters) and data (32 bits), supports
creation/data modification, data fetch, and deletion,
and can be referenced from all processes.
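
The counting semaphore summarized in Figure 4 can be modeled as follows. This is a minimal, single-threaded sketch: the class and method names are hypothetical, and serving waiting processes strictly in arrival order is an assumption, since the text does not state the wake-up policy.

```python
class CountingSemaphore:
    """Single-threaded model of a counting semaphore (names are assumptions)."""

    def __init__(self, initial):
        self.initial = initial
        self.count = initial     # current count value, never negative
        self.waiting = []        # (process ID, requested count), arrival order

    def reinitialize(self):
        self.count = self.initial

    def acquire(self, pid, n=1):
        """Take n units if available; otherwise record the process as waiting."""
        if not self.waiting and self.count >= n:
            self.count -= n
            return True
        self.waiting.append((pid, n))
        return False

    def release(self, n=1):
        """Add n units and return the IDs of the waiting processes now served."""
        self.count += n
        woken = []
        while self.waiting and self.waiting[0][1] <= self.count:
            pid, wanted = self.waiting.pop(0)
            self.count -= wanted
            woken.append(pid)
        return woken
```

Because acquisition and release both take a count, the same object covers mutual exclusion (initial count 1) and the management of a pool of several identical resources.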

Global name data management. Global name data
can be defined as data with a globally visible name.
Processes can share global name data to exchange small
amounts of data at high speed. Each item carries a
freely chosen name up to eight characters long, by
which all processes can reference and modify the data,
as seen in Figure 5. There are two kinds of global name
data. The first kind continues to exist regardless of the
existence of the process that created or modified it and
must be deleted explicitly when no longer needed. (In
BTRON/286, the modification of global name data is
treated as if it were the creation of new data.) The
second kind is deleted automatically when the process
that created or modified it terminates. Either kind can
be designated when creating or modifying global name
data. Global name data mainly allow certain data to be
referenced by any process.
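
A small table keyed by name is enough to illustrate the two kinds of global name data. The sketch below is an assumption-laden model, not the BTRON/286 interface: the class GlobalNameTable and its method names are hypothetical, and the "delete on termination" kind is represented by recording an owner process ID.

```python
class GlobalNameTable:
    """Toy model of global name data: 8-character names, 32-bit values."""

    MAX_NAME = 8

    def __init__(self):
        self.entries = {}   # name -> (value, owner process ID or None)

    def set(self, name, value, owner_pid=None):
        """Create or modify an entry; modification is treated as a new creation.

        owner_pid=None models the kind that persists until explicitly deleted;
        a process ID models the kind removed when that process terminates.
        """
        if not 1 <= len(name) <= self.MAX_NAME:
            raise ValueError("name must be 1 to 8 characters long")
        self.entries[name] = (value & 0xFFFFFFFF, owner_pid)

    def get(self, name):
        return self.entries[name][0]

    def delete(self, name):
        del self.entries[name]

    def process_terminated(self, pid):
        """Drop every entry tied to the terminated process."""
        doomed = [n for n, (_, owner) in self.entries.items() if owner == pid]
        for name in doomed:
            del self.entries[name]
```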



Memory management. The BTRON specification 
provides for management of local and shared memory. 

Local memory. This type of memory can be used in
only one process and cannot be accessed directly by
other processes. Local memory is created automatically
at the same time a process is created and is
released automatically along with the process at the
time the process terminates. Application processes can
acquire and use memory blocks of the required size
from local memory. The acquired memory block contains
continuous (logical) addresses, and the start
(logical) address is returned to the application.

Shared memory. This type of memory can be used by
all processes for data that are shared by several
processes, for data used by the extended nucleus, and for
other purposes. A number of memory pools make up a
shared-memory area, though at system start-up only
the system memory pool exists. This is a special memory
pool consisting of the entire available system memory.
Application processes acquire shared-memory blocks of
the specified size from the specified memory pool.
Processes dynamically create ordinary memory pools
that are identified by a memory pool ID given at the
time of creation. Creation of a memory pool amounts to
allocating part of the system memory pool and making
it a separate, discrete memory pool. See Figure 6.
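
The relationship between the system memory pool and ordinary memory pools can be sketched as simple bookkeeping. The model below tracks only byte counts, not addresses, and every name in it (MemoryPools, SYSTEM_POOL, the ID numbering) is a hypothetical choice made for illustration.

```python
class MemoryPools:
    """Byte-count model of shared-memory pools (no real addresses)."""

    SYSTEM_POOL = 0   # hypothetical ID for the system memory pool

    def __init__(self, system_size):
        # At start-up only the system memory pool exists.
        self.free = {self.SYSTEM_POOL: system_size}
        self.next_id = 1

    def create_pool(self, size):
        """Carve an ordinary pool out of the system memory pool."""
        if size > self.free[self.SYSTEM_POOL]:
            raise MemoryError("system memory pool exhausted")
        self.free[self.SYSTEM_POOL] -= size
        pool_id = self.next_id
        self.next_id += 1
        self.free[pool_id] = size
        return pool_id

    def acquire_block(self, pool_id, size):
        """Take a block of the specified size from the specified pool."""
        if size > self.free[pool_id]:
            raise MemoryError("memory pool exhausted")
        self.free[pool_id] -= size
        return size
```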

File management. The file structure adopted in the
BTRON specification maps the real/virtual object
model onto the actual hardware. BTRON/286 users
manipulate data based on this model, but the underlying
data structure is implemented as files, as in
conventional systems. Note that

• Files are structured as record streams; that is, they
consist of ordered series of variable-length records.

• A random network of reference relationships exists
among files based on links (virtual objects) included in
files. (No directory exists as in conventional file
systems.)

• These links (virtual objects) access files directly.

• Access is controlled via a user access level number
(0-15) that is assigned to each user and a file access
level number (0-15) that is assigned to each file.

Files and links. As just mentioned, a file is a stream 
of ordered, variable-length records that corresponds to 
a real object. A link, corresponding to a virtual object, 
points to files and serves as a clue for file reference. A 
link can be embedded in any file as a record, and more 
than one link can point to the same file. In this way an 
overall network of reference relationships is defined. 
Links reference files directly, and as a result the name
of a file does not have an absolute meaning to uniquely
identify a file. Rather, it functions as one search key.
Any file name of up to 20 characters can be assigned, 
and the same name can be assigned to more than one file 
if desired. See Figure 7. 

File structure. Every file system has one root file. By
tracing the links included in the root file, users can
reach all files in the file system, as shown in Figure 8.
In the real/virtual object model, a root file corresponds
to a device’s real object. When a file system is created,
it acquires a file system name up to 20 characters long
and a device location name. The system and the users
need the file system name, and the name of the root file,
for absolute identification of the file system. The device
location name, also up to 20 characters in length,
indicates the physical device in which the file system is
stored.

Attaching the file system. At system start-up, no file
system exists. Before one can be used, an attachment
operation as shown in Figure 9 must take place, whereby
the name of the logical device in which the file system
exists and the attaching name are specified. Detaching
an already attached file system is done by specifying
the name of the logical device to the detaching command.
As a result, one cannot access a file via links that
reference the files in the detached file system. The links
are said to be in detached status. In the real/virtual
object model, a link in detached status corresponds to
a shadow object.

File ID. A file ID is a unique 16-bit number assigned 
to all files in the file system. The file ID of the root file 
in a file system is always 0. 
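
The file model described above (record streams, embedded links, a root file with ID 0, and names that act only as search keys) can be illustrated with a short sketch. All class and method names here are hypothetical, and assigning file IDs sequentially is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    file_id: int                 # a link points at a file by its file ID


@dataclass
class File:
    file_id: int
    name: str
    records: list = field(default_factory=list)   # ordered record stream


class FileSystem:
    """Toy file system without directories; files are reached through links."""

    def __init__(self, root_name):
        self.files = {}
        self.next_id = 0
        self.root = self.create(root_name)   # the root file receives ID 0

    def create(self, name):
        if len(name) > 20:
            raise ValueError("file names are limited to 20 characters")
        new_file = File(self.next_id, name)
        self.files[new_file.file_id] = new_file
        self.next_id += 1
        return new_file

    def add_link(self, holder, target):
        """Embed a link to `target` as one record of `holder`."""
        holder.records.append(Link(target.file_id))

    def reachable(self):
        """Trace links from the root file; cycles in the network are tolerated."""
        seen, stack = set(), [0]
        while stack:
            file_id = stack.pop()
            if file_id in seen:
                continue
            seen.add(file_id)
            for record in self.files[file_id].records:
                if isinstance(record, Link):
                    stack.append(record.file_id)
        return seen

    def find_by_name(self, name):
        """A name is only a search key, so several files may match."""
        return sorted(f.file_id for f in self.files.values() if f.name == name)
```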

Figure 6. System memory pool. The system memory pool
(= entire available system area) contains the memory
pools. Application processes acquire and use
shared-memory blocks of the specified size from the
specified memory pool.

Figure 7. Files and links. (A file is a real object.)

Figure 8. File system structure.

Figure 9. A file system attachment (file system A and
file system B).

Direct and indirect links. A fixed link, a link stored
in a file as one record, can reference only files in the
same file system. A link file is a special file that
controls a reference to a different file system and is
used to configure an indirect link. References to files
in different file systems must be made via a link file in
the originating file system. Links to link files are called
indirect links because they add another level of
indirection. As can be seen in Figure 10, references to
files in the same file system can be made via a direct
link stored in a file. An indirect link is a fixed link to a
link file; a direct link is a fixed link to an ordinary file.

Figure 10. An indirect link and a link file, shown for
file system X and file system Y. To store link A in file
BB, first create link file a, then store indirect link A'
to a in file BB.
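
The difference between direct and indirect links, and the effect of detaching a file system, can be sketched as a two-step lookup. This is an illustration only: the names Volume and FileSystems are hypothetical, and storing a (file system name, file ID) pair inside a link file is an assumption about what a link file holds.

```python
class Volume:
    """One file system: ordinary files and link files keyed by file ID."""

    def __init__(self):
        self.ordinary = {}     # file ID -> content
        self.link_files = {}   # file ID -> (target file system name, file ID)


class FileSystems:
    """Registry of attached file systems."""

    def __init__(self):
        self.attached = {}     # attaching name -> Volume

    def attach(self, name, volume):
        self.attached[name] = volume

    def detach(self, name):
        del self.attached[name]

    def follow(self, fs_name, file_id):
        """Follow a fixed link stored in a file of file system `fs_name`."""
        volume = self.attached[fs_name]
        if file_id in volume.ordinary:
            return volume.ordinary[file_id]              # direct link
        target_fs, target_id = volume.link_files[file_id]  # indirect link
        if target_fs not in self.attached:
            return None                                  # detached status
        return self.attached[target_fs].ordinary[target_id]
```

Returning None for a detached target mirrors the shadow object of the real/virtual object model: the link still exists, but the file behind it cannot be reached.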

Event management. This capability uniformly
treats keyboard and pointing-device operations as
events by which users can interact with the computer.
See Figure 11. These operations are recorded as one
event after another in the systemwide event queue.
Applications fetch these events one after another from
the event queue, with the corresponding action being
carried out in an event-driven form. This arrangement
is based on the rule that, at a given point in time, only
the process engaged in dialogue uses the event
management capability to fetch events. The BTRON/286
specification defines the following types of events:

• button down,

• button up,

• key down,

• key up,

• automatic repeat key,

• device,

• null, and

• application events 1 to 8.

Each event is associated with a bit mask, which
identifies the event a process has fetched.
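
One way to picture the event queue and its bit masks is shown below. The bit positions, the function name fetch_event, and the use of the mask to select which events to fetch are all assumptions made for this sketch; the text above states only that each event type has an associated bit mask.

```python
from collections import deque

# Event types listed in the text; the bit assigned to each one is hypothetical.
EVENT_TYPES = [
    "button down", "button up", "key down", "key up",
    "automatic repeat key", "device", "null",
] + [f"application {n}" for n in range(1, 9)]

MASK = {name: 1 << bit for bit, name in enumerate(EVENT_TYPES)}


def fetch_event(queue, wanted_mask):
    """Remove and return the oldest queued event whose bit is in wanted_mask.

    When nothing matches, a null event is returned (an assumption).
    """
    for index, (kind, data) in enumerate(queue):
        if MASK[kind] & wanted_mask:
            del queue[index]
            return kind, data
    return "null", None
```

An application loop in this model would call fetch_event repeatedly with the mask of the events it handles and dispatch on the returned kind.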

Device management. Device management functions
include an application interface for the uniform
handling of various devices, device-driver registration and
management for various devices, and a device-driver
interface. See Figure 12. The file management and
event management capabilities in the operating system
nucleus perform the actual device operations. A logical
device name is a unique character string registered in
the system and consisting of class, unit, and subunit
elements. A class name indicates the type of device,
such as hd for hard disk, fd for flexible disk, and so on.
A unit distinguishes individual devices when a number
of units of the same class exist. Normally, a single letter
of the alphabet, starting from a, identifies units, for
example, hda. When one unit is logically subdivided, a
subunit distinguishes each part using numbers from 0 to
254 assigned in order, for example, hda0 and hda1 for
hard disk hda.
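
The class, unit, and subunit elements of a logical device name can be separated with a short parser. The sketch assumes lowercase class names, a single unit letter, and an optional decimal subunit, as in the hda0 example; the function name and the error handling are hypothetical.

```python
import re

# class name (letters), one unit letter, optional decimal subunit
_DEVICE_NAME = re.compile(r"^([a-z]+)([a-z])([0-9]*)$")


def parse_device_name(name):
    """Split a logical device name such as 'hda0' into its three elements."""
    match = _DEVICE_NAME.match(name)
    if match is None:
        raise ValueError(f"not a logical device name: {name!r}")
    device_class, unit, digits = match.groups()
    subunit = int(digits) if digits else None
    if subunit is not None and subunit > 254:
        raise ValueError("subunit numbers run from 0 to 254")
    return device_class, unit, subunit
```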


Figure 11. Event management. At a given time, only
the process engaged in dialogue fetches an event,
making use of the event-management capability. Events
from the keyboard or pointing device enter the event
queue, and dialogue with users is passed along from one
application process to another.

Figure 12. Device management. Application processes
use the application interface, through file management,
event management, etc., to reach device management,
which uses the driver interface to reach the device
drivers and the actual devices.


Figure 13. Panel (a) and window (b) management. The
panel plane is the operation environment space on the
display screen. (a) Panels are classified into the floating
type (windows and others), the modal type, and the
system-message panel. Windows are managed by the
window management capability, and dialogue panels by
the panel management capability. (b) The window frame
is drawn by the window management capability, and the
working area is drawn by the application process.


Extended nucleus 

As shown earlier in Figure 1, the software contains 
many functions. Here, we explain three special features 
of the BTRON operating system: 

• panel management, 

• real and virtual object management, and 

• TAD (the TRON Application Databus). 

Panel management. The BTRON specification
designates a rectangular region on the (virtual/real)
display screen as a panel. A panel can be placed in an
overlapping manner any place on the screen and
displayed via programs. When a hidden panel is exposed
(say, if a panel that obscures another panel is moved),
the newly exposed area in the panel is redrawn to show
the data that should be visible on the screen. The
display of each panel is independent of what is shown
in other panels.

When different panels appear on the screen, only one
panel can accept a user’s keyboard input. The user must
click the button on the pointing device (a penlike
accessory) to change the status of the panels so that a
different panel can accept input. The BTRON
specification makes it possible to realize a panel
(windowing) system with a minimum of hardware and
memory. A basic assumption is that no special hardware
support is available.


Panel types. Panels can be classified into modal, 
floating, and system-message categories, as shown in 
Figure 13. Users call up modal panels on the screen via 
application programs; these panels can’t be moved 
once they are displayed. Modal panels are always 
exposed unless hidden by other modal panels. When a 
modal panel is displayed, it is the only panel with which 
a user can interact. 

A user can relocate floating panels by dragging the 
pointing device over the screen. When a floating panel 
can accept input, the user can choose to click the 
pointing device and access another panel so it will 
accept input. A floating panel that shows the content of 
a real object and allows users to interact is called a 
window in BTRON terminology. Usually, a window 
shows a part of a real object, and the user scrolls the 
screen to take a look at different parts of the object. 
Floating and modal panels that are not windows are 
called dialogue panels. Dialogue panels often offer the 
user an interface for setting application parameters. 

The system-message panel is a long, narrow panel 
that appears at the bottom of the display screen. Only 
one system-message panel can appear on a screen. It 
holds system messages and is never hidden. 

Windows can be extended to occupy the full screen 
(save, of course, the system-message panel) and are 
said to be in full-screen mode. The user can easily 
change a window so that it occupies either a small 
region or a full screen. 


Figure 14. Standard window shape and title bar
display (a) and a window in the input enable status (b).


Window display. Figure 14 shows the standard
window shape and standard title bar display.

Window operations. Recall that a window is a
type of panel. There are five window operations:

• Creating a window. The user executes an application
using a menu and clicks twice on a pictogram so
that the selected virtual object is opened in the window.

• Switching input enable status. The user either
clicks on the inside of the window with the
pointing-device button or selects the real object name
from the display management menu. (This allows the
window to move to input enable status and become the
first front window.)

• Moving a window. The user drags the pointing device
over any part of the window frame other than a
handle or scroll bars.

• Resizing a window. The user drags a handle inside
the window frame to choose an arbitrary window size or
a full-screen window. For a full-screen window, the
user either selects the full-screen mode from the menu
or clicks twice on the handle. Users can switch between
the full-screen mode and normal (original) size easily.

• Erasing a window. The user either selects Quit
from the menu or clicks twice on a pictogram.

Pointer shape. The screen displays a pointer shape
(in the form of a hand for easy understanding; see
Figure 15), which indicates the active operation. When
neither a pointing-device button, menu button, nor
Command key is pressed, the pointer shape depends on
the pointing-device position and automatically changes
to a shape that indicates the operation possible in that
position. (On a standard TRON digitizing pen, the
pointing-device button is located at the tip of the pen
and the menu button in the middle of its side. The
Command key appears on the TRON keyboard.)

Real/virtual object management. These functions
let the user display and manipulate real and virtual
objects, as well as register and delete application
programs. They also allow those application programs
making use of real/virtual object management to
display and manipulate real and virtual objects. In
addition, all application programs fundamentally are
executed via real/virtual object management. The
real/virtual object management functions include:

• various functions for display and manipulation of
virtual objects,

• various functions for display and manipulation of
fusen,

• application program management
(registration/deletion/execution),

• menu setting,

• file system-mounting management, and

• other application functions.

Virtual object operations. Listed here are the
operations possible with regard to virtual objects. Users
activate these operations directly on the screen either
by using a pointing device or by selecting an item from
a virtual object operation menu. The operations are

• selection, 

• removing a disk, 

• copy or movement, 

• resizing, 

• opening, 

• closing, 

• modification of a real object’s name, 

• modification of a relation, 

• creation of a new revision, 

• display of virtual/real object management
information,

• opening as a window, 

• modification of display attributes, and 

• registration or deletion of an application. 

Physical implementation. These objects are mapped 
to physical data that consist of many types of records. 
For example, a real object can consist of records that 
contain application programs, program-specific data, 
and TAD (explained next) format data that can be 
shared between applications and many systems. Each 
record of these objects contains subdivisions called 
segments. 


TAD (TRON Application Databus). The BTRON 
specification provides a uniform data format called 
TAD, which permits the ready exchange of data among 
different applications. A TAD record makes up a part of 
the real objects in BTRON. 

From the standpoint of assuring data compatibility 
between applications, TAD can be seen to have these 
four features: 

• guaranteed compatibility of basic data (text, figure, 
and sound) across applications; 

• variable-length segment format for easy data- 
exchange processing; 

• the capability to hold application-dependent data; 
and 

• a realization of the real/virtual object model in
exchanged data, allowing external data referencing.

The fusen: handling application-specific data in
TAD. TAD specifications are designed to guarantee
data compatibility across applications at a minimum
level. The TAD format takes text and figures as its
basic data, based on the notion that text and graphics
are two kinds of data that can be handled by any
application. However, if we define the data format for
text and figures in a very restricted way, many programs
may suffer from a lack of flexibility and a loss of
efficiency. To avoid this problem, TAD permits an
application-specific data format and clearly separates
the application-specific data from the
application-independent data.

Application-specific data, such as a database index
or a special binary data structure, are stored as fusen (a
Japanese word for a small piece of paper used as a
memo pad). Fusen is a generic name for
application-specific segments. Although the content of
a fusen is application-specific, its external format
follows certain rules. Among other things, the name of
the application that can interpret the fusen is stored in
a defined format at the beginning of each fusen
segment. So when an application encounters a fusen
segment it cannot interpret, it can safely ignore the
segment, since it knows the length of the segment. In a
sense, BTRON standardizes the way nonstandard data
are stored.

Fusen can be stored like virtual objects inside a real 
object and are displayed as rectangular regions on the 
display. However, unlike virtual objects, each fusen 
itself holds some data internally. (A virtual object is a 
link to other real objects.) 

Usually the user can change the content of a fusen by
showing it on the screen and then invoking the
associated application by clicking to have a dialogue
panel appear. The user actually changes fusen by
interacting through the dialogue panel.

Fusen can be classified broadly into the following
types:

• Functional. This type of fusen holds data necessary
for invoking application programs. If a functional fusen
is embedded in a real object, the fusen can specify the
application that works on the real object.

• System setting. This type of fusen modifies system
settings such as the volume of a beeping sound, which
repeats automatically whenever a key is held down for
a specific length of time.

• Adjective. This fusen is embedded in the text or
figure record of a real object. An application will use
the adjective fusen to interpret data with added detail.
For example, font information or format information
for a character is stored as adjective fusen in
TAD-format data. The basic text editor, which is
available on any BTRON-based system, can interpret
the adjective fusen and act accordingly.

Figure 15. The pointer shape interacting with the screen
(a), the pointing device (b), and the TRON keyboard (c).
Panel (a) labels the pointer shapes for moving, jump
scrolling, drag scrolling, and resizing, including the
shapes shown while the pointing device is pressed.

The following manipulations can be performed on
fusen:

• selection (same as with a virtual object);

• copy or movement (same as with a virtual object);

• modification of the name (same as modifying a real
object name);

• opening or setting parameters (same as opening a
virtual object as a panel; the corresponding application
is executed, and normally a dialogue panel is
displayed); and

• modification of display attributes (same as
modification of display attributes of a virtual object,
except that the setting items are virtual object subsets).

Figure 16. The structure of a variable-length segment.

Figure 17. The structure of a text fusen segment and a
figure-drawing segment.

As an example, a spreadsheet or database application
adopting a tabular organization consists of cells. The
attributes of each cell and information on table
structures are stored as adjective fusen in TAD. The text
data can be handled by any application, whereas only
certain applications can process the adjective fusen.

TAD guarantees data exchange, at least of the data in
each cell of the table-based application, as character
strings. In this way various applications can make use
of the table data at the most basic level, that of
character-string data.

Variable-length segments. TAD consists of structured
data called variable-length segments or
variable-length data having the structure 0xFF +
segment ID + data length + parameters, as shown in
Figure 16. The 0xFF code at the beginning of the
segment equates with the escape code in the TRON code
system. 0xFF indicates the start of the segment, and the
segment ID indicates the type of fusen. The data length
gives the data size of the parameters that follow. The
text fusen segments and figure fusen segments contain
a sub ID giving further information on the type of
segment. In this case the structure appears as shown in
Figure 17.

We gain two advantages from classifying types of
fusen and adopting a structure that includes segment ID
and byte-length fields. First, applications unable to
process a segment can skip it, based on the data-length
information, and go on to process the next data. This
means that applications are not required to support
every segment, and there is less burden on the system
developers. Second, there is no need to maintain a large
fixed-length table of formatting information.
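
The skip-by-length behavior can be made concrete with a small segment walker. The field widths used here are assumptions for illustration only: a 1-byte segment ID and a 2-byte little-endian data length, neither of which is stated in the text. The sketch also assumes a stream made up solely of segments, with no plain character data in between, and the segment IDs in the usage note are invented.

```python
import struct


def make_segment(segment_id, params):
    """Build 0xFF + segment ID + data length + parameters (assumed widths)."""
    return bytes([0xFF, segment_id]) + struct.pack("<H", len(params)) + params


def walk_segments(data, known_ids):
    """Yield (segment ID, parameters) for known segments and skip the rest."""
    pos = 0
    while pos < len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected 0xFF at the start of a segment")
        segment_id, length = struct.unpack_from("<BH", data, pos + 1)
        start = pos + 4                       # 1 (0xFF) + 1 (ID) + 2 (length)
        if segment_id in known_ids:
            yield segment_id, data[start:start + length]
        pos = start + length                  # the length field allows skipping
```

For example, an application that understands only the hypothetical segment ID 0xA0 can read a stream containing other segment types without knowing anything about their contents, because the walker advances by the stored length.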

Holding application-dependent data. The
application-dependent data that are not covered by the
TAD specifications can be stored in three possible ways:
in the application program record, in the adjective fusen
record, or in the text-application fusen or
figure-application fusen.

The basic mechanism adopted in BTRON to associate
the application-specific data and the application
program itself is to embed a segment that holds the
application ID. This ID designates the application that
can interpret the data. The choice of the method to use
in embedding the application ID is not specified at this
moment. We believe the necessary underlying
mechanisms support the idea of storing the
application-specific data in TAD to promote the
maximum portability.

Virtual object segments. TAD realizes the
real/virtual object model by means of link records and
virtual object segments. A link record holds link
information to a real object, while a virtual object
segment holds information for visually displaying a
virtual object.


Figure 18. Data representation based on the real/virtual
object model. (Labels in the figure: specified fusen
record, TAD main record, specified fusen segment,
text-application fusen.)


When data contain a virtual object, TAD saves the
virtual object segment by embedding it in the text or
figure data. The position of the segment shows the
relative position of the virtual object in the data, and the
order of appearance of real images defines the
correspondence between link records and virtual object
segments. Figure 18 shows how data are represented
when two virtual objects exist in the same set of text
data.

Other standard formats allow data in other files to be
referenced, but TAD offers the advantages of referring
to the network-type file and of preserving virtual
information.


BTRON/286 implements the BTRON human-machine
interface on an 80286-based hardware
system. Please note that the BTRON human-machine
interface can be implemented on different
hardware platforms and can be implemented in many
ways. Currently, work is proceeding to develop a
BTRON human-machine interface that runs on the
TRON VLSI CPU. In this implementation, designers
emphasize distributed operating system functionality.
We hope we can report on the next development in a
year or two.


References 

1. K. Sakamura, “BTRON, The Business-Oriented
Operating System,” IEEE Micro, Vol. 7, No. 2, Apr.
1987, pp. 53-65.

2. K. Sakamura, ed., “TRON Project 1988: Open
Architecture Computer Systems,” Proc. Fifth TRON
Project Symp., Springer-Verlag, Tokyo, 1988.

3. BTRON/286 Specification, The TRON Association,
Tokyo, Dec. 1988.



Yoshiaki Kushiki    Kazuhiro Oda


Ken Sakamura's biography, picture, and address appear on 
p. 13. 

Yoshiaki Kushiki currently works in Matsushita Electric
Industrial Co., Ltd.’s Information Systems Research
Laboratory. He has been engaged in the research and
development of system architecture, such as operating
systems, databases, the human-machine interface, and
artificial intelligence. He received the BE degree in
electronics engineering from Kyoto University in Kyoto
and is a member of the IEEE.

Kazuhiro Oda is a chief specialist in the Personal Computer
Design Department at the Ome Works of Toshiba
Corporation. He is now a member of the BTRON and CTRON
technical committees of the TRON Association. He has been
engaged in software design and development of large and
medium-scale computers, small business computers,
distributed processors, and word processors. Oda received
the BE degree from Kyushu University in Fukuoka.


FEATURE 


A Floating-Point VLSI Chip 
for the TRON Architecture: 

An Architecture for Reliable 
Numerical Programming 


Preliminary 
evaluations of 
this FPU show a 
good combination 
of precision and 
performance. 

It supports IEEE 
floating-point 
arithmetic and 
the ANSI C draft 
proposal. 


Shumpei Kawasaki 
Mitsuru Watabe 
Shigeki Morinaga 

Hitachi Ltd. 


The TRON project [1] advocates a network system consisting of
heterogeneous computing nodes called intelligent objects that have a wide
range of functions. Figure 1 illustrates how a standard network
connects the intelligent objects so that they can communicate with each
other. The TRON architecture also specifies a processing element to be used
as a component of the intelligent objects: the VLSI CPU [2], which defines a
standard instruction set for a central processing unit. Multiple vendors will
provide the Gmicro microprocessor family [3], a series of VLSI (very large
scale integration) chips conforming to the architecture. Among these
microprocessors are the Gmicro/100, Gmicro/200, and Gmicro/300.

While the TRON architecture for the VLSI CPU handles integer,
bit-field, string, list-structure, and bitmapped data, it excludes the
floating-point data intended to be handled by the coprocessor or library. As some
intelligent objects will undoubtedly handle real number data, floating-point
instructions are urgently needed in the architecture for the VLSI CPU. With
this in mind, we designed the Gmicro/FPU to provide floating-point
instructions for both the Gmicro/200 and the Gmicro/300. The VLSI CPU
architecture defines 23 coprocessor instructions, some of which are designed
to be used in the floating-point instructions.

Since the TRON project is meant to provide information infrastructure
for various layers of society, the reliability of its parts is the key issue. The
intelligent objects can participate in navigation, construction,
manufacturing, chemical-plant control, air-traffic control, and even
exploration of the universe. Therefore, their reliability can affect the lives of
individuals and the public in general. As these applications massively utilize
floating-point numbers, the numerical integrity of the Gmicro/FPU
potentially could become a social issue. Thus, though performance is an
important criterion in the development of the Gmicro/FPU, accurate data are
even more important [4].

Background 

A floating-point number is an attempt to express real number data in 
digital information. A conventional floating-point number consists of a sign 
bit, exponent field, and mantissa field as seen in Figure 2. 


26 IEEE MICRO 


0272-1732/89/0600-0026$01.00 © 1989 IEEE







The value of a floating-point number is determined by:

$$ \text{value} = (-1)^{s} \times 2^{\,e - \text{bias}} \times m $$

where s is the sign bit, e is the exponent, and m is the mantissa.
Bias is a constant that enables the floating-point number to express
a value larger or smaller than one. A wider exponent field would
increase the range of the value expressed by the format. A wider
mantissa field would increase its resolution.
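As a concrete illustration of the value formula, the following C sketch splits a 32-bit single-precision pattern into its three fields and rebuilds the value from them. The field widths (1, 8, and 23 bits), the bias of 127, and the implicit leading 1 of the mantissa are those of the IEEE single format rather than anything stated in the paragraph above; the names `fp_fields`, `fp_decode`, and `fp_value` are hypothetical, and only normalized numbers are handled.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: the three fields of an IEEE 754 single. */
struct fp_fields {
    unsigned sign;      /* s: 1 bit */
    unsigned exponent;  /* e: 8 bits, stored with a bias of 127 */
    uint32_t fraction;  /* 23 stored bits; m = 1 + fraction / 2^23 */
};

/* Split a float into sign, exponent, and fraction fields. */
struct fp_fields fp_decode(float f)
{
    uint32_t bits;
    struct fp_fields r;
    memcpy(&bits, &f, sizeof bits);      /* copy the raw 32-bit pattern */
    r.sign     = (unsigned)(bits >> 31);
    r.exponent = (unsigned)((bits >> 23) & 0xFFu);
    r.fraction = bits & 0x7FFFFFu;
    return r;
}

/* Rebuild (-1)^s * 2^(e - bias) * m. Normalized inputs only. */
double fp_value(struct fp_fields r)
{
    double m = 1.0 + (double)r.fraction / 8388608.0;   /* 8388608 = 2^23 */
    int k = (int)r.exponent - 127;                      /* remove the bias */
    double scale = 1.0;
    for (; k > 0; --k) scale *= 2.0;
    for (; k < 0; ++k) scale /= 2.0;
    return (r.sign ? -1.0 : 1.0) * scale * m;
}
```

For example, -6.5 is -1.625 * 2^2, so `fp_decode(-6.5f)` yields a sign of 1 and a stored exponent of 129 (2 plus the bias of 127).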

Floating-point hard¬ 
ware and software. Prior 
to 1960 computers used 
instructions to handle in¬ 
tegers and programmers 
wrote floating-point soft¬ 
ware libraries utilizing 
these instructions. The 
programmers found it dif¬ 
ficult to write a floating¬ 
point library that executed efficiently and yet was 
economical in memory usage. Mainframe manufactur¬ 
ers introduced special floating-point arithmetic instruc¬ 
tions in the 1960s, drastically improving the execution 
speed and economy of memory. In the 1970s minicom¬ 
puter vendors also started to supply floating-point 
instructions on high-end products. Unfortunately, each 
manufacturer chose different floating-point data 
formats. Exchanging the floating-point data and 
floating-point software between two different 
machines presented major difficulties. 

To resolve the difficulties, in the late 1970s the 
Institute of Electrical and Electronics Engineers proposed 
a floating-point standard (now called the ANSI/IEEE 
Standard 754-1985 for Binary Floating-Point 
Arithmetic 6). Microprocessor manufacturers have 
unanimously accepted this standard. It explicitly defines 
the essential hardware features of floating-point 
arithmetic, including the floating-point data format, 
semantics of operation, conditional branch mecha¬ 
nisms, and exception handling. All conforming sys¬ 
tems give the same result for each operation. 

Figure 1. The standardized network structure linking "intelligent objects" advocated 
by the TRON project. Details of the construction can be found in Sakamura. 5 
Standardized operating systems are proposed. 

Figure 2. The components of a conventional floating-point number: sign, 
exponent, and mantissa. 

Writing floating-point software in a high-level language, or HLL, increases 
the software’s reliability, portability, and maintainability. However, when 
written in the current HLLs, 7 numerical software often suffers a decrease in 
reliability. It is impossible for a programmer to determine the exact 
floating-point operations that will be executed for a particular source-level 
construction of an HLL. The floating-point operations generated from an HLL 
source code remain unpredictable in the eyes of the programmers. Some of the 
mysteries often encountered in HLL numerical programming include the 
following: 

1) Commutative law (x + y = y + x) does not hold in 
many HLL compilers. 

2) In Fortran one can test two variables for inequality 
at one point in a program and find that they are not 
equal, and then test them again later to find that they are 
equal. 

3) A floating-point program often gives different 
results before and after a code optimization by an HLL 
compiler. 
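The mechanism behind such surprises can be sketched in C. This is an illustration and not an example from the cited sources: the first function shows that a value rounded to single precision no longer equals the same constant held in double precision, and the other two show that regrouping a sum, as an optimizer might, changes the result. The function names are hypothetical, and `volatile` is used only to force each intermediate to be stored at its declared precision.

```c
/* Returns 1 if 0.1 stored as float still equals 0.1 stored as double. */
int float_equals_double_tenth(void)
{
    volatile float  stored = 0.1f;   /* rounded to single precision */
    volatile double wide   = 0.1;    /* rounded to double precision */
    return (double)stored == wide;
}

/* (a + b) + c, with the intermediate forced to double precision. */
double sum_left(double a, double b, double c)
{
    volatile double t = a + b;
    return t + c;
}

/* a + (b + c), the regrouping an optimizer might be tempted to make. */
double sum_right(double a, double b, double c)
{
    volatile double t = b + c;
    return a + t;
}
```

With a = 1e20, b = -1e20, and c = 1.0, the first grouping returns 1.0 while the second returns 0.0, because 1.0 is lost when it is added to -1e20.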

To eliminate the unpredictability and ambiguities of 
floating-point arithmetic in the C language, the American 
National Standards Institute released a draft proposal 
for the C language. 7 C, when implemented according 
to this draft proposal, makes the exact floating-point 
operations visible to the programmer. 


June 1989 27 































TRON FPU 


Summary of 
Floating-Point 
Standards 

One industry standard and one standard pro¬ 
posal define the boundary conditions of future 
floating-point hardware implementations. The 
purpose of both of them is the portability of 
floating-point software among various systems. 
One calls for more floating-point software to be 
written in a high-level language. 

The ANSI/IEEE 754-1985 Standard for Binary 
Floating-Point Arithmetic explicitly defines 
essential functions of floating-point arithmetic 
hardware and software. However, it defines the 
functions as floating-point operations without 
regard to other software support such as that in 
high-level languages. The standard sets 

• data formats (32-bit single, 64-bit double, 
and 80-bit extended-double precisions); 

• six standard arithmetic operations (addition, 
subtraction, multiplication, division, square root, 
and remainder), requiring a precise solution for 
each; 

• branches on floating-point conditions; 

• binary and decimal conversion of floating-point data; 

• rounding methods; and 

• exception traps and handling methods. 

The Draft Proposed American National Stan¬ 
dard for Information Systems—Programming 
Language C determines the exact floating-point 
operations that will be executed for a particular 
source-level construction. The programmer can 
estimate what floating-point operations are gen¬ 
erated just from the C source program. The draft 
proposal sets 

• precisely defined rules for expression evaluation. 
In the type promotion rule the binary operators 
(+, -, *, /) must calculate the result in the data 
format of the operand(s) with the highest precision. 
Examples of the type promotion rule 
include: 

<long double> + <double> -> <long double> 

<float> * <double> -> <double> 

<double> - <double> -> <double> 

• strict rules to eliminate optimization result¬ 
ing in the loss of exactness of operations, 

• no change of operation precedence, and 

• 21 well-defined mathematical functions. 
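
The type promotion rule can be checked on any ANSI-conforming compiler by asking for the size of the result type of a mixed-precision expression. The sketch below is only an illustration; the function names are hypothetical, and `sizeof` reports the type of the result, not the precision a particular FPU uses internally.

```c
#include <stddef.h>

/* Size in bytes of the result type of each mixed-precision expression. */
size_t size_ld_plus_d(void)
{
    long double a = 1.0L; double b = 2.0;
    return sizeof(a + b);            /* <long double> + <double> */
}

size_t size_f_times_d(void)
{
    float a = 1.0f; double b = 2.0;
    return sizeof(a * b);            /* <float> * <double> */
}

size_t size_d_minus_d(void)
{
    double a = 1.0, b = 2.0;
    return sizeof(a - b);            /* <double> - <double> */
}

size_t size_f_plus_f(void)
{
    float a = 1.0f, b = 2.0f;
    return sizeof(a + b);            /* <float> + <float> stays float */
}
```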


ANSI is developing the draft proposal to “provide an 
unambiguous and machine-independent definition of 
the language C.” 7 The standard lets a programmer 
determine the exact floating-point operations that will 
be executed by adjusting the source-level construction 
of C. Most of the mysterious phenomena seen in HLL 
floating-point programming can be eliminated by explicit 
rules for the precision of intermediate values in 
expressions in the compiled code and by rigorously defined, 
standard mathematical library functions. Most 
HLL compilers capriciously choose the precision for 
intermediate expressions in the compiled code, which 
explains most of the mysterious phenomena described 
earlier. A programmer can now write numerical software 
without fearing the hidden “features” of an HLL. 

Though the draft proposal sets no specific require¬ 
ments on floating-point hardware, conventional float¬ 
ing-point units are not suited for efficient execution of 
a system based upon the draft proposal. Thus designers 
must give the floating-point hardware more dexterity in 
controlling the precision of the intermediate values to 
handle the exact operation demanded by the ANSI draft 
proposal. The Gmicro/FPU contains an architectural 
feature that adjusts to the draft proposal without 
impairing performance. See the box for a summary 
of the IEEE standard and the ANSI draft proposal 
documents. 

The precision issue. The IEEE floating-point docu¬ 
ment standardizes the precision of basic mathematical 
operations such as add, subtract, multiply, divide, 
square root, et cetera. No precision requirement for the 
elementary functions is included; it is left up to the 
implementer. (Elementary functions are frequently 
used in scientific arithmetic and include trigonometric, 
hyperbolic, exponential, and logarithmic functions.) 

When floating-point hardware is used in critical 
applications such as navigation, the precision of an 
elementary function matters a great deal. A small 
error in a trigonometric function can lead you many 
miles away from your destination. Hardware designers 
cannot pass this problem to the software designers, 
since they are in a position to determine any trade-off 
between performance and precision. Since the software 
designers must compete with each other about the 
performance of the total system, they can hardly im¬ 
prove the precision at the cost of performance. On the 
other hand, the hardware designers are more likely to be 
the ones to choose the bit length of the arithmetic unit 
and the constants. They are also the ones to have the 
options in implementing special hardware without in¬ 
troducing deterioration of the performance. 
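
To put a rough number on the navigation remark, suppose a position component is computed as x = d cos θ over a leg of length d. The figures below are illustrative assumptions, not values from the article:

$$ \Delta x = d \,\bigl|\Delta(\cos\theta)\bigr|, \qquad d = 1000\ \text{miles},\quad \bigl|\Delta(\cos\theta)\bigr| = 10^{-3} \;\Longrightarrow\; \Delta x = 1\ \text{mile}. $$

An absolute error of one part in a thousand in the computed cosine therefore displaces the result by a full mile on this leg, and the displacement grows in proportion to d.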


Floating-point units 

Most microprocessor manufacturers prefer to integrate 
the floating-point arithmetic functions on a separate 
chip from the CPU known as a floating-point unit. 
Familiar FPU examples include Intel’s i80387 designed 
for use with the i80386 CPU and Motorola’s MC68882 
designed to accompany the MC68020 CPU. There are 
two reasons for producing a CPU and FPU separately. 
The first is the system’s requirement: Floating-point 
functions are not used in all systems. The second is the 
nature of VLSI production: The yield Y of a VLSI chip 
is expressed by 

$$ Y = a \, e^{-D S} $$

where D is the density of defects, S is the die area, and 
a is a coefficient factor common to all technology. The 
technology determines the density of defect D. For 
high-end, state-of-the-art microprocessors, the yield is 
very sensitive to small changes in the die area. Thus, 
implementing the entire CPU function and floating¬ 
point arithmetic calculations on one die would cause 
the yield of the chip to drop substantially, resulting in 
a more-costly chip. 
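
The effect of merging the two dies follows directly from the yield expression. Writing S_CPU and S_FPU for the two die areas (symbols introduced here for illustration):

$$ Y_{\text{CPU+FPU}} = a\, e^{-D\,(S_{\text{CPU}} + S_{\text{FPU}})} = Y_{\text{CPU}} \; e^{-D\, S_{\text{FPU}}} $$

As an assumed example, if D S_FPU = 0.5, the combined die yields only e^{-0.5} ≈ 0.61 times as many good chips as the CPU die alone.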

Coprocessor interface. An instruction extension 
chip such as an FPU designed to accompany a VLSI 
CPU is often called a coprocessor. To reduce the 
number of interchip interconnections, designers build 
an autonomous control system into the extension chip, 
making it a cooperating processor rather than an exten¬ 
sion of the CPU’s logic. The instructions extended by 
the coprocessors are called coprocessor instructions. 
When a VLSI CPU encounters a coprocessor instruc¬ 
tion, it generates a series of handshake procedures on 
the coprocessor chip. These procedures, called 
coprocessor protocols, perform the transfer of the in¬ 
struction information, data, and state, while leaving the 
execution of the instruction to the coprocessor. The 
CPU normally does not “know” the content of coproces¬ 
sor instruction operation, leaving the detailed defini¬ 
tion of coprocessor instructions to a later time when the 
coprocessor is actually implemented. 

The TRON architecture for the VLSI CPU includes 
23 coprocessor instructions, which cover all probable 
interactions with coprocessors to be defined in the 
future. Since the coprocessor protocol is not part of the 
architecture and its implementation is left to the manu¬ 
facturers, the protocol can be defined so that it is 
optimum to the specific technology in which the chip is 
fabricated. 

The Gmicro VLSI CPUs such as the Gmicro/200 and 
Gmicro/300 include coprocessor instructions. Pro¬ 
grammers see the FPU instructions the same way they 
see other CPU instructions. By connecting the Gmicro 
FPU, we add 50 floating-point-related instructions and 
19 floating-point-related registers to the Gmicro instruction 
set and register resource. Vendors producing 
the Gmicro family of VLSI chips have defined a 
coprocessor protocol common to the Gmicro family. 

The coprocessor protocol between the CPU and FPU 
determines much of an FPU’s performance. The speed 
of bare arithmetic hardware, such as the arithmetic 
logic unit or the multiplier, does not determine 
performance unless the chip is also equipped with a 
fast coprocessor protocol. Manufacturers 
have implemented various kinds of handshake 
methods between the CPU and the extension. Some 
coprocessors scan the instruction stream, identify their 
own instructions, and activate themselves. Other 
coprocessors are mapped onto a special memory space 
so the CPUs can access them as slave processors. The 
protocol of the Gmicro/FPU and Gmicro VLSI CPUs 
is of the latter type, except that some special interface 
pins are added to decrease the number of bus cycles 
necessary to complete the coprocessor protocol. The 
number of bus cycles is a major determinant of the 
maximum performance of the FPU, since the arithmetic 
operations can now be performed in very few machine 
cycles. 

New computation algorithms. With advances in 
technology, vocabularies of the VLSI architecture 
gradually evolve, and the cost of each numerical opera¬ 
tion changes. As a result designers discard traditional 
algorithms in favor of more appropriate ones such as 
the Cordic (Coordinate Rotation Digital Computer) 
algorithm for calculating a wide range of elementary 
functions. Cordic’s implementation with a set of ad¬ 
ders, shifters, and read-only memory allows most of 
today’s VLSI FPUs to use the silicon area to best 
advantage. 

Instances of an algorithm inspired by a new hardware 
technology can be found in the Gmicro/FPU. Designers 
introduced a parallel multiplier consisting of a matrix 
of carry-save-adders, which multiplies two operands in 
two or three machine cycles. Previously, when imple¬ 
mented in a microinstruction loop, multiplication took 
30 to 70 machine cycles. The drastic reduction in the 
computation cost of multiplications makes room for 
new algorithms for IEEE square-root and elementary 
functions. The new algorithm is potentially faster than 
any known algorithm when fast multiplication is 
available. 
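
The article does not spell out the new square-root algorithm at this point. One well-known multiplication-based scheme is Newton-Raphson iteration on the reciprocal square root, sketched below only to show why a fast multiplier changes the trade-off; it is not presented as the Gmicro/FPU's method. The name `sqrt_by_multiply`, the fixed seed of 1.0, the iteration count, and the restriction to 0.5 <= x <= 2 are assumptions, and the result is not guaranteed to be correctly rounded as the IEEE standard requires.

```c
/* Newton-Raphson on y = 1/sqrt(x): y <- y * (1.5 - 0.5 * x * y * y).
   Each step needs only multiplications and one subtraction. */
double sqrt_by_multiply(double x)
{
    double y = 1.0;     /* crude seed, adequate for 0.5 <= x <= 2 */
    int i;
    for (i = 0; i < 8; ++i)
        y = y * (1.5 - 0.5 * x * y * y);
    return x * y;       /* sqrt(x) = x * (1 / sqrt(x)) */
}
```

Because the error roughly squares on every step, the cost is dominated by a handful of multiplications, which is cheap when each takes two or three machine cycles rather than 30 to 70.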


Requirements 

The design objectives for the Gmicro/FPU floating¬ 
point unit were set to: 

• fully conform to the IEEE floating-point standard, 

• provide adequate architectural support for the 
ANSI C draft proposal, 

• provide an efficient CPU interface as well as high- 
performance floating-point hardware, 

• provide elementary functions with the highest pre¬ 
cision, and 

• provide an entire set of mathematical library func¬ 
tions as instructions. 




Architecture 

The Gmicro/FPU can be used as a resource of the 
Gmicro/200. Figure 3 shows the extended register after 
the FPU is attached to the Gmicro/200. The FPU contains 
sixteen 80-bit, floating-point data registers (FR0-FR15) 
and three control registers: the FMCR (floating-point 
mode), FSR (floating-point status), and FQR 
(floating-point quotient). The registers function as 
follows: 

• FR0-FR15 registers store floating-point data; 

• the FMCR specifies trap enable/disable and round¬ 
ing modes; 

• the FSR stores condition codes, exception flags, 
and accrued exception flags; and 

• the FQR stores the quotient for modulo and remain¬ 
der instructions. 
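
For readers who think in data structures, the register resource can be modeled as a C structure. This is a sketch only: the 32-bit width chosen for the three control registers and all identifiers are assumptions, since the text gives the width of the data registers (80 bits) but not of the control registers.

```c
#include <stdint.h>

enum { GFPU_NUM_DATA_REGS = 16, GFPU_DATA_REG_BYTES = 10 };  /* 16 x 80 bits */

struct gfpu_regs {
    uint8_t  fr[GFPU_NUM_DATA_REGS][GFPU_DATA_REG_BYTES];  /* FR0-FR15 */
    uint32_t fmcr;  /* trap enable/disable and rounding modes (width assumed) */
    uint32_t fsr;   /* condition codes, exception and accrued exception flags */
    uint32_t fqr;   /* quotient for modulo and remainder instructions */
};

/* Number of programmer-visible registers the FPU adds. */
int gfpu_register_count(void)
{
    return GFPU_NUM_DATA_REGS + 3;   /* data registers plus FMCR, FSR, FQR */
}
```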

Instruction set. The FPU’s instruction set and its 
basic formats appear in Table 1 and Figure 4. All 
instructions for arithmetic operations have only float¬ 
ing-point registers for destinations. Figure 4 displays 
the major instruction formats. Designers modeled the 
basic instruction format after the coprocessor instruc¬ 
tions in the TRON architecture for the VLSI CPU. Both 
source and destination operands have size specifiers, 
which give the unit an unusual architectural strength. 
The result is rounded to the precision of the destination 
operand size, exception signaled (if any), and stored in 
the floating-point data register. 

Figure 5 illustrates how, when variables are assigned 
in floating-point registers, an ANSI C compiler can 
generate faster code for the Gmicro/FPU than can be 
obtained with conventional FPUs. The C-language 
source code in Figure 5a on p. 32 is cited as an example 
of expression evaluation. According to the ANSI proposal, 
the evaluation of the expression u = t * (x + y) + z 
takes the four steps shown in Figure 5b; in each step, 
the rounding precision is explicitly specified. Both 
source and destination operands have size specifiers, 
which enables the unit to conform to the ANSI C draft 
proposal even when the values reside in its floating-point 
registers. The type promotion rule in the earlier box 
determines this precision. 
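
The difference between the four prescribed rounding steps and evaluation in a wide register format can be reproduced in portable C. The sketch below is an illustration under stated assumptions: the function names are hypothetical, `volatile` is used only to force each intermediate to be rounded to its declared type, and the inputs in the note that follows were chosen to make the difference visible.

```c
/* u = t * (x + y) + z with the rounding each of the four steps prescribes. */
float eval_per_draft(float x, float y, float z, double t)
{
    volatile float  s1 = x + y;     /* step 1: round to float  */
    volatile double s2 = t * s1;    /* step 2: round to double */
    volatile double s3 = s2 + z;    /* step 3: round to double */
    return (float)s3;               /* step 4: round to float  */
}

/* The same expression with every intermediate kept in a wider format,
   as happens when results are converted to the register length. */
float eval_wide(float x, float y, float z, double t)
{
    volatile long double s1 = (long double)x + y;
    volatile long double s2 = t * s1;
    volatile long double s3 = s2 + z;
    return (float)s3;
}
```

With x = 16777216 (2^24), y = 1, z = -16777216, and t = 1, the first function returns 0 because x + y is not representable in single precision and rounds back to 2^24, while the second returns 1.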


[Figure 3 diagram (b): floating-point data registers FR0-FR15, each spanning 
bit 0 to bit 79, with the 64-bit (bit 0 to bit 63) and 32-bit (bit 0 to bit 31) 
formats marked; floating-point mode control register FMCR; floating-point 
status register FSR; floating-point quotient register.] 

Figure 3. The TRON register set (a) and the extension of the register resource when the FPU is added to the 
Gmicro chips (b). 



A common tendency in the current floating-point 
chips is the precision change that occurs when floating¬ 
point registers are used in optimization. 4 Most of these 
chips convert results to the length of the registers—the 
longest data type a register can hold. Promotion of a 
float (32-bit) variable to a long double (80-bit) 
variable occurs when the destination is a floating-point 
register. To observe the ANSI proposal’s requirement 
on conventional floating-point chips, we would have to 
generate the object code shown in Figure 5c, a succes¬ 
sion of 10 instructions. The .s and .d suffixes for the 
assembler mnemonic in this figure indicate single pre¬ 
cision and double precision. 

Figure 5d shows object code for the Gmicro/FPU that 
observes the ANSI proposal. Here, only five instructions 
evaluate the expression, thereby reducing by 50 percent 
the amount of code necessary with a conventional FPU. 

Elementary function instructions. The ANSI C 
proposal also determines the range of mathematical 
functions. Mathematical functions defined in the 
header file <math.h> take double-precision arguments 
and return double-precision values. The tenet of the 
function definition is that the domain of the mathemati¬ 
cal function must match with the mathematical defini¬ 
tion. The Gmicro/FPU instructions conform to this 
proposal. No software envelope is necessary, and an 
instruction for an elementary function can simply be 
generated in line when the functions are called. The 
complete matching of the domain of the function is 
ensured. Table 2 on p. 33 lists the correspondence of the 
instructions and the <math.h> file. Future ANSI expan¬ 
sions are expected to include all operand precisions in 
elementary functions. This requirement can be met by 
having the FPU specify the source and destination size 
in the instruction. 

The table also shows the mathematical library functions 
of Fortran 77, which has complex numbers as a 
data type. Mathematical functions dealing with a 
complex data type can easily be constructed by using 
the Gmicro/FPU elementary function instructions. 
However, this is out of our scope and not shown in the 
table. 


Table 1. Gmicro/FPU instruction set. 

Operation                 Function 

Basic                     Addition, subtraction, multiplication, division, 
                          remainder, IEEE modulo, square root 

General                   Absolute value, negation, round to integer, 
                          truncate to integer, extract exponent, extract 
                          mantissa, generate constant 

Elementary functions      Sine, cosine, sine cosine simultaneous 
                          calculation, tangent, arcsine, arccosine, 
                          arctangent, sine hyperbolic, cosine hyperbolic, 
                          tangent hyperbolic, arctangent hyperbolic, 
                          exponent base 2, exponent base 10, natural 
                          exponent, logarithm base 2, logarithm base 10, 
                          natural logarithm 

Graphics support          Vector inner product (up to eight dimensions), 
                          clipping judgment 

Conditional branches      Compare, test, conditional branch 

Data format conversion    Load floating-point number, store floating-point 
                          number, load signed integer, store signed integer, 
                          load unsigned integer, store unsigned integer, 
                          load control register, store control register 

Operating system related  No operation, internal reset, save internal state, 
                          restore internal state, load multiple 
                          floating-point numbers, store multiple 
                          floating-point numbers 


[Figure 4 diagram: instruction-word layouts (a) and (b), with bit positions 
0, 5, 6, 7, 8, 11, 12, and 15 marked. Legend:] 

EA        Effective address field 
OP1, OP2  Operation code 
Sx        Source floating-format specifier 
Sy        Destination floating-format specifier 
src       Source operand specifier 
dest      Destination operand specifier 
FRn       Floating-point data register n 
EXP       Effective address extension 


Figure 4. Major instruction formats in the Gmicro/FPU: memory-register (a) and register-register (b). All arithmetic 
instructions fall under these categories, and only store instructions can store to memory. 



main() 
{ 
    register float x, y, z, u; 
    register double t; 
    { 
        u = t * (x+y)+z; 
    } 
} 

(a) 


1) x + y is performed to infinite precision and rounded to float 
or single (32-bit) precision. 

2) t * [result of (1)] is performed to infinite precision and rounded 
to double (64-bit) precision. 

3) [result of (2)] + z is performed to infinite precision and rounded 
to double (64-bit) precision. 

4) [result of (3)] is rounded to single precision and stored in u. 

(b) 


sp   : stack pointer 
fr15 : x 
fr14 : z 
fr13 : y 
fr12 : t 
fr11 : u 

fmov.s  fr15, fr0    ; move x to fr0 
fadd.s  fr13, fr0    ; add y and store in fr0 
fmov.s  fr0, @sp     ; store result to round to single 
fmov.s  @sp, fr0     ; load expression x+y in fr0 
fmul.d  fr12, fr0    ; multiply t with content of fr0 
fmov.d  fr0, @sp     ; round the result to double 
fmov.d  @sp, fr0     ; load expression t * (x+y) in fr0 
fadd.s  fr14, fr0    ; add z to fr0 
fmov.s  fr0, @sp     ; round the result to single 
fmov.s  @sp, fr11    ; store the result to u 

10 instructions 

(c) 


sp   : stack pointer 
fr15 : x 
fr14 : z 
fr13 : y 
fr12 : t 
fr11 : u 

fmov  fr15, fr0.s    ; move x to fr0 (accumulator) 
fadd  fr13, fr0.s    ; add y to fr0, round to float 
fmul  fr12, fr0.d    ; multiply t, round to double 
fadd  fr14, fr0.d    ; add z, round to double 
fmov  fr0, fr11.s    ; round fr0 store to u 

5 instructions 

(d) 


.s  Single precision 
.d  Double precision 


Implementation 

Here we describe a new, tightly coupled coprocessor 
protocol for the Gmicro/200-Gmicro/FPU system. One 
of the design objectives in this protocol is to minimize 
the CPU-FPU communication overhead to improve the 
total performance of the FPU. To achieve this goal, we 
judiciously reduced the amount of CPU, FPU, and 
memory bus transfers needed for instruction execution. 
We introduced the following schemes in the protocol: 

1) Some functional redundancies between the CPU 
and the FPU. The decoder and the microcode of both 
the Gmicro/CPU and the Gmicro/FPU store the 
coprocessor instruction and protocol sequence. Both 
chips “know” exactly what is to be done once the 
coprocessor instruction is determined. This capability 
significantly reduces the bus cycle count over that 
found in other FPUs, since no coprocessor instruction 
knowledge must be transferred from the coprocessor to 
the CPU. 

2) Protocol reduction. Coprocessor status pins 
(CPST) reduce the number of bus cycles in a protocol. 
The Gmicro/CPU can monitor the Gmicro/FPU’s status 
without having to invoke a bus cycle. 

3) Direct data transfer. The Gmicro/200’s bus con¬ 
trol circuitry directly transfers data between the mem¬ 
ory and the Gmicro coprocessor. Eliminating data trans¬ 
fers between the CPU and the coprocessor normally 
required for memory-coprocessor data transfer again 
reduces the number of bus cycles required for the 
protocol. 

4) CPU write-through cache support. The Gmicro/ 
FPU supports a write-through cache on future CPUs. 
Securing the data coherency between the Gmicro/ 
CPU’s internal cache and external memory becomes an 
easy task. 

5) Multiple coprocessor support. Up to eight 
coprocessors can logically connect to the CPU. In this 
way, the CPU can distribute the work load to several 
coprocessors, making overlapping operations on mul¬ 
tiple coprocessors possible. 

6) Fast floating-point conditional branch. Condi¬ 
tional branch tests on the coprocessor’s internal state 
take place using the coprocessor status lines, thereby 
making the operation faster. 

An example multiple-coprocessor system. Figure 6 
depicts a multiple-coprocessor system made up of 
Gmicro family chips. It contains four coprocessors 
including an FPU. A common clock serves both the 
CPU and the coprocessors as the time reference for 
synchronized operations. The synchronized clock enables 
all of the processors to operate essentially as one 
processor. Each coprocessor contains input signals 
called coprocessor identifications, or CPIDs, to 
uniquely assign an identification for user software. The 
system initializes the CPID in the reset operation. All 
processors share the data bus, address bus, and part of 
the control bus. Notable control signals include bus 
access type BAT0-2 and coprocessor status CPST0-2. The 
CPST0-2 and the BAT0-2 for each coprocessor are 
OR-wired electrically or logically. 


Figure 5. Two operand size specifiers on a floating-point 
instruction conform to the ANSI C draft proposal 
very simply when the compiler allocates variables 
in the floating-point registers. C language 
source code (a); its translation into a floating-point 
number according to the ANSI standard (b); object 
code on conventional FPUs (c); and the Gmicro/FPU 
optimization (d). 


Table 2. Mathematical libraries and instructions for the Gmicro/FPU. 

Instruction              ANSI C proposal math.h   Fortran 77/ANSI 

fabs                     fabs                     ABS 
fsincos, fdiv            (none)                   COTAN 
facos                    acos                     ACOS 
fsinh                    sinh                     SINH 
fasin                    asin                     ASIN 
fsqrt                    sqrt                     SQRT 
fatan                    atan                     ATAN 
ftan                     tan                      TAN 
fatan, fdiv              atan2                    ATAN2 
ftanh                    tanh                     TANH 
fcos                     cos                      COS 
floge                    log                      LOG 
fcosh                    cosh                     COSH 
flog10                   log10                    LOG10 
fexp                     exp                      EXP 
fmod                     modf                     MOD 
fexp2                    frexp                    (none) 
(none)                   pow                      (none) 
fint, fintrz             floor, ceil              AINT, ANINT, NINT 
fmul                     *                        *, DPROD 
fneg                     -                        SIGN 
fscale                   ldexp                    (none) 
fsqrt, fsincos, fatan    (none)                   SQRT 
fsin                     sin                      SIN 


Bus access signals BAT0-2. The Gmicro/CPU uses 
the bus access signals to inform the Gmicro/FPU of 
contents on the data bus. The CPU 

1) sends an operation’s content to the coprocessor, 

2) writes the operand data into the coprocessor, 

3) reads the operand data from the coprocessor, 

4) coordinates the direct data transfer from the 
coprocessor to the memory, 

5) coordinates the direct data transfer from the 
memory to the coprocessor, and 

6) sends the coprocessor instruction address to the 
coprocessor. 

The BAT0-2 signals along with a read/write signal 
from the CPU determine which one of the above bus 
access types is currently taking place. From this information, 
the Gmicro/FPU sends, receives, or processes 
data on the data bus for each bus access type. 


[Figure 6 diagram: a Gmicro/CPU and four coprocessors (Gmicro/FPU #1 and 
Gmicro coprocessors #2, #3, and #4) share the address bus, the data bus, and 
the OR-wired BAT and CPST signals. Each coprocessor has CPID inputs, 
initialized to #1 through #4. CPID: coprocessor identification. CPST: 
coprocessor status pin.] 

Figure 6. System configuration of a multiple FPU application. Coprocessor 
identification is input from external pins of coprocessors at the time the 
system is initialized. 


[Figure 7 diagram: protocol sequence between the CPU and the FPU.] 

Figure 7. Protocol sequence in the Gmicro/FPU when executing a coprocessor instruction that loads a 
single-format operand and completes an arithmetic operation. 

Coprocessor status signals. The CPST0-2 signals 
report the coprocessor’s status to the Gmicro/CPU. The 
sampling timing of the CPST0-2 signals is determined 
within the protocol. Normally, these signals are 
sampled on the second bus cycle in the protocol. The 
first bus cycle is the operation content transfer. In this 
case the coprocessor compares its internal state and the 
operation content to see if the operation can be processed 
immediately and returns the CPST0-2 signals. 
The statuses that a coprocessor can present over the 
CPST0-2 signals include: 

1) Command acceptance. The operation required by 
the CPU is acceptable and immediately processed. 


2) Command error. The coprocessor does not recog¬ 
nize the operation. 

3) Coprocessor busy. The coprocessor cannot accept 
the operation, and therefore the CPU must redo the first 
two bus cycles in the coprocessor protocol. 

4) Coprocessor exception. The coprocessor detects 
an exception in the previous operations, and therefore 
the CPU must take an exception vector. 

5) Data transfer ready. The coprocessor is ready to 
accept or send data operands. This status is reported for 
the execution of a coprocessor instruction involving 
multiple operands. 

6) Condition true. The conditional branch test on the 
coprocessor state is true. 

7) Condition false. The conditional branch test on the 
coprocessor state is false. 
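
The seven statuses fit in the three CPST lines, and the CPU's reaction to each can be summarized as a small dispatch function. The sketch below is a software model under stated assumptions: the numeric encodings are placeholders because the article does not give the actual CPST0-2 bit patterns, all identifiers are hypothetical, and treating a command error as an exception action is an interpretation of the text.

```c
/* The seven statuses listed above; encodings are placeholders only. */
enum cpst_status {
    CPST_COMMAND_ACCEPT,
    CPST_COMMAND_ERROR,
    CPST_BUSY,
    CPST_EXCEPTION,
    CPST_DATA_READY,
    CPST_COND_TRUE,
    CPST_COND_FALSE
};

/* What the CPU does after sampling the status lines. */
enum cpu_action {
    CPU_PROCEED,
    CPU_RETRY,
    CPU_TAKE_EXCEPTION,
    CPU_BRANCH,
    CPU_FALL_THROUGH
};

enum cpu_action cpu_react(enum cpst_status s)
{
    switch (s) {
    case CPST_COMMAND_ACCEPT:
    case CPST_DATA_READY:    return CPU_PROCEED;
    case CPST_BUSY:          return CPU_RETRY;        /* redo first two bus cycles */
    case CPST_COND_TRUE:     return CPU_BRANCH;
    case CPST_COND_FALSE:    return CPU_FALL_THROUGH;
    case CPST_COMMAND_ERROR:                          /* assumed: handled as exception */
    case CPST_EXCEPTION:     return CPU_TAKE_EXCEPTION;
    }
    return CPU_TAKE_EXCEPTION;                        /* unknown pattern */
}
```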

Basic types of protocols. A combination of the data, 
command, and address transfers on the bus cycles and 
status tests in a certain order forms the protocol sequence 
for a coprocessor instruction. Twelve kinds of 
protocols perform Gmicro/FPU instructions including 
internal state save/restore, multiple-register load/store, 
and floating-point arithmetic. The floating-point arithmetic 
operations fall into three groups: 

• an operation on the FPU’s internal registers only; 

• an operation using an operand(s) external to the 
FPU (in a CPU register, memory, or the CPU cache) and storing 
the result in the FPU’s internal register(s); and 

• an operation using an operand(s) internal to the 
FPU and storing the result in a resource external to 
the FPU (memory or a CPU register). 

A coprocessor protocol example. Figure 7 illustrates 
a coprocessor protocol for a Gmicro/FPU dyadic in¬ 
struction, an instruction involving two operands to 
produce the result in an operand. One operand resides 
in the memory and the other in the Gmicro/FPU’s 
register. This example helps to explain the coprocessor 
protocol. In executing this coprocessor instruction, the 
Gmicro/CPU and Gmicro/FPU perform the following 
activities: 

• Instruction decoded, operation content extracted. 
The CPU fetches an instruction and decodes it. If it is a 
floating-point instruction, the CPU extracts the infor¬ 
mation needed by the FPU from the coprocessor 
instruction. 

• Operation content transferred to the FPU along 
with the coprocessor ID. The CPU transfers the operation 
content and command. When this bus cycle completes, 
three of the address lines indicate the value of 
the CPID number to which this coprocessor instruction 
is sent. The coprocessors monitor the three address 
lines and compare the CPID number on the address 
lines with their ID numbers. The coprocessor whose 
CPID matches the number acknowledges the 
command and asserts the CPST0-2 lines. Other 
coprocessors put their coprocessor status lines in the 
three-state condition. 

• The CPU coordinates the operand transfer from
the memory to the FPU. The CPU asserts the bus
control lines and the address lines and instructs the
main memory to put an operand word on the data bus.
At this point the FPU asserts the CPST0-2 lines, and the
CPU samples them. The FPU receives the operand on the
data bus, and the CPU proceeds with the protocol if it
detects the command acceptance status. If a coprocessor
busy status is detected, the CPU reverts to the second
activity (the command transfer). If another coprocessor
status is presented, exception actions are taken. CPST0-2
detection and the operand transfer occur simultaneously.
If the operand transfer requires more than one bus cycle,
either because the operand is larger than 32 bits or
because the operand is misaligned in memory,
the bus cycles repeat until the operand transfer
completes.

• The CPU transfers the FPU instruction address to 
the FPU. The CPU transfers the floating-point instruc¬ 
tion address via the data bus from which the FPU 
receives the FPU instruction address. This step ensures 
the availability of the exception information necessary 
in case an error occurs at a later stage. When the FPU 
acknowledges the instruction address transfer, the 
protocol sequence terminates. 
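
The sequence of bus cycles described above can be sketched as a small simulation. This is only an illustrative model under stated assumptions: the names `Status` and `run_protocol` are hypothetical, the status codes are reduced to three cases (accept, busy, anything else), and the status is assumed to be sampled during the first operand cycle.

```python
from enum import Enum


class Status(Enum):
    """Simplified coprocessor status values (hypothetical encoding)."""
    ACCEPT = "command accept"
    BUSY = "coprocessor busy"
    OTHER = "any other status"


def run_protocol(statuses, operand_cycles=1):
    """Return the bus cycles issued by the CPU for one dyadic instruction.

    statuses: status values the FPU presents, one per command attempt.
    operand_cycles: bus cycles needed to move the memory operand.
    """
    trace = []
    replies = iter(statuses)
    while True:
        trace.append("command")   # operation content, CPID on address lines
        trace.append("operand")   # first operand cycle, status sampled here
        status = next(replies)
        if status is Status.BUSY:
            continue              # revert to the command transfer
        if status is not Status.ACCEPT:
            raise RuntimeError("coprocessor exception")
        # Wide or misaligned operands need further operand cycles.
        trace.extend(["operand"] * (operand_cycles - 1))
        break
    trace.append("FIA")           # floating-point instruction address
    return trace


print(run_protocol([Status.ACCEPT]))
```

In this model an accepted single-cycle operand needs three bus cycles (command, operand, FIA), and a busy reply simply repeats the command transfer.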

Bus-cycle reduction. Figure 8 replicates the actual
time chart used for the protocol when the FPU performs
a floating-point arithmetic operation of the kind just
mentioned: one operand of single-precision data format
residing in memory and one operand in a floating-point
register, with the result stored in a floating-point
register. This operation takes 10 machine cycles to
complete the handshake.

[Figure 8 timing chart. Legend: ACC, command accept; FIA, floating-point instruction address.]

Figure 8. FADD command-cycle timing (memory + FR). A clock cycle equals the machine cycle. The Gmicro/220-
Gmicro/FPU system executes this protocol in 10 machine cycles.

[Figure 9 block diagram. External signals: CLK, CPID, CPST, bus-control signals, D0-D31. BCU (bus control unit): protocol and pipeline control. FCU (format-conversion unit): data aligner, format-conversion control, data format-conversion unit. ECU (execution control unit): microcode ROM, data unit, constant ROM, floating-point register file, exponent arithmetic unit, mantissa arithmetic unit, multiplier. The BCU, FCU, and ECU form the three-stage command pipeline; the ECU contains a five-stage microinstruction-level pipeline.]

Figure 9. Internal structure of the Gmicro/FPU. The BCU, FCU, and ECU units constitute the active operational
blocks, and another piece of logic arbitrates them.

In this specific protocol the coprocessor status line 
saves one bus cycle, and direct data transfer between 
the memory and the Gmicro/FPU saves another bus 
cycle. Thus we save a total of two bus cycles from the 
original five—a 40 percent reduction. This amount of 
reduction is entirely effective since 10 machine cycles 
are also consumed by the floating-point addition for a 
single-precision operand. The new protocol scheme 
matches bandwidths between the high-performance 
floating-point arithmetic hardware and the protocol. 

Other aspects. The FPU can also be used with a CPU
that lacks the Gmicro coprocessor protocol when it is
connected as a peripheral. The CPU emulates the transfers
in the protocol sequence of coprocessor operation, with
the exception that the coprocessor status is read from a
special register address. For future Gmicro/CPUs with
nonwrite-through data caches, the Gmicro/FPU can
retain data coherency between the external memory
and the CPU cache. When the correct data is in the
memory, the CPU can activate the bus cycle from the
memory to the FPU. When the correct data is in the
CPU’s data cache, the CPU can write the data into the
FPU. The FPU does not differentiate between the data
access methods, carrying out the protocol in the same
way wherever the operand originates.

Internal FPU structure. Figure 9 shows the
Gmicro/FPU’s internal structure. The three main elements
are the:

• Bus control unit (BCU), which is responsible for 
handshaking with the main processor. The unit carries 
out bus cycles, generates the status of the coprocessor, 
and processes the coprocessor protocol. It also carries 
out the necessary command/data/status transfers be¬ 
tween the main processor and the FPU. 

• Format conversion unit (FCU), which converts all 
operands sent from the main processor or the memory 
to the internal floating-point data format before the 
actual arithmetic operation starts. In the FPU one inter¬ 
nal data format represents all floating-point data. Once 
the store operation of the floating-point data completes, 
the data expressed in the internal format are again 
converted into the external formats, as defined in the 
IEEE standard. 

• Floating-point execution unit (ECU), which consists
of the microcode ROM and four other units: data
type, exponent arithmetic, mantissa arithmetic, and
multiplier. The IEEE floating-point data format
expresses special values such as infinity, zero, and
not-a-number. The arithmetic operations involving these
special values bypass normal sequencing and occur in
the data unit. The sign bit also resides in this unit and is
manipulated here. The exponent arithmetic unit
processes addition/subtraction operations for the exponent
calculations. The addition, subtraction, shifting, division,
and other necessary operations are performed on
the mantissa of floating-point values in the mantissa
arithmetic unit. The quarter-size flash multiplier (33
bits × 33 bits) performs high-speed multiplication
operations on the mantissa of the floating-point numbers.
This multiplier unit is used frequently when calculating
square roots, elementary functions, and vector
inner products, as well as multiplication operations.

Three-stage pipeline. In addition to the coprocessor 
interface with the reduced bus cycle overhead, the 
Gmicro/FPU controls a three-stage pipeline. Often the 
entire protocol processing becomes hidden in a float¬ 
ing-point arithmetic operation. Figure 10 illustrates 
how the three-stage pipeline control increases the 
throughput of the FPU by allowing command or data 
fetches in the BCU, data conversions in the FCU, and 
floating-point arithmetic operations in the ECU to be 
performed concurrently. 

Excluding the instruction fetch and effective address 
calculation by the CPU, a floating-point instruction 
execution in the FPU can be divided into the following 
three phases: 

• the BCU fetches the command and operand, the 
BCU fetches the FIA (floating-point instruction ad¬ 
dress); 

• the FCU receives the operand from the BCU and 
converts IEEE format data into internal data format; 
and 


• the ECU performs floating-point arithmetic. 

By overlapping the operations for different floating¬ 
point instructions, each unit operates in parallel. The 
pipeline arbitration occurs in a finite-state machine that 
samples the state of each unit. The status of each 
instruction at any one time (see Figure 10) in the 
pipeline is as follows: 

• the first instruction is in the ECU processing phase; 

• the second instruction is in the FCU processing 
phase; and 

• the third instruction is in the BCU processing phase. 
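
The effect of overlapping the three phases can be sketched with a simple in-order scheduling model. The stage durations below are hypothetical values chosen only for illustration, not Gmicro/FPU measurements, and the function name `schedule` is an assumption. The model also ignores the fact that format conversion can begin in the middle of the protocol.

```python
def schedule(instructions, stages=("BCU", "FCU", "ECU")):
    """Return {(instruction index, stage): (start, end)} in cycles.

    Each instruction is a dict mapping a stage name to its duration.
    A stage starts once the same instruction has left the previous
    stage and the previous instruction has left this stage.
    """
    times = {}
    for i, durations in enumerate(instructions):
        ready = 0  # when this instruction leaves the previous stage
        for stage in stages:
            stage_free = times[(i - 1, stage)][1] if i > 0 else 0
            start = max(ready, stage_free)
            end = start + durations[stage]
            times[(i, stage)] = (start, end)
            ready = end
    return times


# Hypothetical durations in machine cycles.
instr = {"BCU": 10, "FCU": 3, "ECU": 10}
timeline = schedule([instr] * 3)
sequential = 3 * sum(instr.values())
pipelined = timeline[(2, "ECU")][1]
print(sequential, pipelined)
```

With these assumed durations, three instructions finish in 43 cycles when overlapped, against 69 cycles if each one had to complete before the next began.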


Elementary functions. Designers know that their 
goals of high performance and high precision are often 
at odds with each other when designing floating-point 
units. Repeating iterations of converging algorithms 
ensures higher precision, yet the use of the same com¬ 
putational algorithm takes longer to execute. In pursuit 
of our design objective of higher precision in elemen¬ 
tary functions, we attempted to streamline the Cordic 
algorithm to avoid lowering performance for precision. 
In realizing the elementary functions, we 

1) adopted the Cordic algorithm because it lets us 
realize the entire range of elementary functions with a 
simple set of hardware. 

2) added an adjustment mechanism that, for each
exponent of the argument, always selects from a constant
pool the best set of angle constants used for rotations.

3) corrected errors after applying the Cordic algo¬ 
rithm to obtain higher precision with decreased Cordic 
iteration count. 


[Figure 10 timing diagram. Rows: CPU (first, second, and third communication); BCU (command, operand, and FIA for each instruction); FCU (format conversion); ECU (floating-point arithmetic).]

Figure 10. Pipeline timing in the Gmicro/FPU showing the CPU as it executes three consecutive memory-to-
register floating-point arithmetic instructions. The stages differ in length. The format-
conversion stage can start from the middle of the protocol. A state machine, which schedules each pipe to
near-maximum efficiency, manages the pipeline.


[Figure 11 panels. (a) Coordinate rotation toward the target angle A, starting with a rotation of $\tan^{-1}(2^{0})$; the remaining difference between the target angle and Cordic’s convergent angle is the error $\Delta A$. (b) Error correction: $\cos A = \cos(\Delta A)\cos(A') - \sin(\Delta A)\sin(A') \approx \cos(A') - \Delta A \sin(A')$ for small $\Delta A$. (c) Flow: start; N Cordic iterations of $x_{i+1} = x_i - \mathrm{sign}(z_i)\, y_i\, 2^{-i}$, $y_{i+1} = y_i + \mathrm{sign}(z_i)\, x_i\, 2^{-i}$, $z_{i+1} = z_i - \mathrm{sign}(z_i) \tan^{-1}(2^{-i})$; then $\cos Z = X_n - Z_n Y_n$ and $\sin Z = Y_n + Z_n X_n$; end.]

Figure 11. Applying the Cordic algorithm: the coordinate-rotation
method (a); the error-correction method
for a trigonometric function (b); and convergence
improvement after replacing the last half of the Cordic
iterations with a faster method (c).


In the following paragraphs we use the cosine function as an
example to demonstrate how we implemented the Cordic
algorithm.

Original Cordic algorithm. The Cordic algorithm invented by
Chen 8 was expanded by Volder 9 and Walther 10 to be applied
for the entire range of elementary functions. The algorithm’s very
wide range of functions even permits a multiplication to be
performed with Cordic hardware. In the Gmicro/FPU, however, we
use a separate multiplier for multiplication. The Cordic
method obtains the solution by repeating vector rotations in the
complex plane and eventually lets the vector converge to the
angle corresponding to the argument for which the
function value is searched.

Figure 11a illustrates the Cordic algorithm process
of simultaneously obtaining cosine and sine for the
angle A. Starting from the vector $V_0(1/C, 0)$, we create
another vector $V_1(x_1, y_1)$ by rotation; we then form a
right triangle with one vertex with the angle
$\tan^{-1}(2^{0})$ and one side as $V_0$. Similarly, we obtain
$V_2(x_2, y_2)$ from $V_1$, and $V_3(x_3, y_3)$ from $V_2$, and so forth
to make the vector $V_i$ approach the target angle A. We
chose the direction of rotations in each iteration to
make the vector closer to the target angle A. We obtain
the coordinate of the vector after this rotation by the
following iterative equations:

$x_{i+1} = x_i - \mathrm{sign}(z_i)\, y_i\, 2^{-i}$    (1)

$y_{i+1} = y_i + \mathrm{sign}(z_i)\, x_i\, 2^{-i}$    (2)

$z_{i+1} = z_i - \mathrm{sign}(z_i) \tan^{-1}(2^{-i})$    (3)

where we have chosen the plus and minus operators to
converge to the target angle A. $\Delta A$, the difference
between the target angle and the vector $V_n$, is translated
into the error in the final solution. The maximum value
of $\Delta A$, $\max(|\Delta A|)$, becomes smaller by one bit with each
rotation and eventually becomes less than $2^{-n}$ after n
iterations:

$\max(|\Delta A|) = 2^{-n}$    (4)


To converge the rotational vector V to the target 
angle A with 64-bit precision, the same as that of 
extended double-precision format, the algorithm re¬ 
quires 64 iterations. 

Cordic becomes faster. As the above argument
shows, in the Cordic algorithm, we obtain only one
significant bit of rotational angle per rotation. To obtain
the solution more quickly, we leave the Cordic
iterations after a certain number of rotations and obtain
the cosine for the angle A' of the rotational vector $V_n$.
This vector occurs in the vicinity of the target angle A
with the difference $\Delta A$. $\Delta A$ being sufficiently small, the
following identity holds:

$\cos(A) = \cos(\Delta A)\cos(A') - \sin(\Delta A)\sin(A') \approx \cos(A') - \Delta A \sin(A')$ for small $\Delta A$    (5)

Using the identity in Equation 5, we obtain cos(A).
The Cordic algorithm gives cos(A') and sin(A'). We get
$\Delta A$ from simple subtraction. Thus we can effectively
replace the remainder of the Cordic iterations by a
subtraction and a multiplication, which take very little
time to execute with the Gmicro/FPU’s multiplier.
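
The idea of stopping the iterations early and applying the correction of Equation 5 can be tried out with a short floating-point sketch. It is not the FPU's microcode: it uses Python doubles instead of shift-and-add hardware, and the function names `cordic` and `cos_corrected` are hypothetical.

```python
import math


def cordic(angle, n):
    """Run n rotation-mode Cordic iterations.

    Returns (cos A', sin A', delta), where A' is the angle reached and
    delta = A - A' is the residual left in z.
    """
    # Scale factor C of the n rotations; starting from (1/C, 0) yields
    # a unit-length result.
    gain = 1.0
    for i in range(n):
        gain *= math.sqrt(1.0 + 4.0 ** -i)
    x, y, z = 1.0 / gain, 0.0, angle
    for i in range(n):
        d = 1.0 if z >= 0.0 else -1.0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * math.atan(2.0 ** -i)
    return x, y, z


def cos_corrected(angle, n):
    """cos(A) ~ cos(A') - delta * sin(A'), as in Equation 5."""
    cos_a1, sin_a1, delta = cordic(angle, n)
    return cos_a1 - delta * sin_a1


plain = cordic(0.7, 16)[0]
corrected = cos_corrected(0.7, 16)
print(abs(plain - math.cos(0.7)), abs(corrected - math.cos(0.7)))
```

Because the neglected terms are second order in the residual, 16 iterations plus one multiplication leave an error on the order of the square of the 16-iteration residual, which is the reason half the iteration count can suffice.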

Implementation, performance, precision. The
Gmicro/FPU repeats the Cordic iterations 32 times and
performs the error-correction operations seen in Figure
11b to obtain the cosine function. For all other functions
including sine, tangent, and hyperbolic functions,
similar error-correction equations exist, and with these
the Cordic iterations repeat only 32 times.

Figure 11c summarizes the comparison of the vital
statistics of the original Cordic algorithm and the
improved Cordic algorithm by examining the cosine
function with extended double-precision (80-bit) format.
The Gmicro/FPU can execute one Cordic iteration
in three machine cycles. By reducing the Cordic iteration
count from its typical 64 times to 32 times, we save
96 machine cycles. The error-correction operation
consisting of subtraction and 64 × 64 multiplication
(which take one machine cycle and 12 cycles, respectively)
can be overlapped with other operations. Thus
the total microcode of the new algorithm takes 111
machine cycles as opposed to the conventional Cordic
algorithm that takes 198 machine cycles. We achieve a
performance improvement of 40 percent. See Table 3.

The preliminary evaluation of the precision tells us
that the cosine function has, in most of its domain, 58
to 62 significant digits out of the 64 mantissa digits of
extended double-precision data. Most libraries or
floating-point processors have 53 to 58 significant digits.
The difference is accounted for as follows:

• the Gmicro/FPU has 66-bit-long angular constants
for Cordic operation, whereas most software libraries
typically have 64-digit angular constants; and

• the rounding-off error tends to accumulate with
repetitive additions toward the end of the Cordic
iterations.

Thus, from a precision standpoint, the new
algorithm demonstrates a satisfactory result.

Table 3. Cordic algorithm speed improvement on the cosine function.

Items                             Conventional Cordic    Gmicro/FPU Cordic
Cordic iteration count            64                     32
No. of multiplications            0                      1
No. of significant digits         53-58 (out of 64)      58-62 (out of 64)
Latency times (machine cycles)    198                    111

Square roots. Square-root derivation is another area 
in which we can exploit the existence of the multiplier. 
With the Newton-Raphson algorithm we reach square 
roots extremely fast when fast multiplication is avail¬ 
able. However, the IEEE floating-point standard sets a 
stringent requirement on the precision of the division 
and square-root solutions, which the N-R algorithm 
does not fulfill. A so-called precise solution is required 
for these operations. The IEEE standard states that the
add, subtract, multiply, divide, square-root, and
remainder operations “shall be performed as if it first
produced an intermediate result correct to infinite
precision and with unbounded range, and then rounded
that result . . .”

It has been believed that the above requirement 
essentially determines the computation algorithm for 
square-root and division operations. Most manufactur¬ 
ers today use the pencil-and-paper algorithm for these 
operations, which essentially obtains one significant 
bit of solution per iteration and which provides a set of 
intermediate results from which an ALU with infinite 
precision is effectively simulated. 

Instead, the Gmicro/FPU uses the new algorithm to 
provide an IEEE square root with a smaller number of 
iterations. The FPU’s N-R algorithm obtains a square- 
root solution in a certain neighborhood of the IEEE 
square root, which we call a pseudo square root. (See 
the following explanation.) It then reconstructs the 
intermediate result of the pencil-and-paper algorithm 
from the pseudo square root and begins the pencil-and- 
paper algorithm for the last several bits to obtain the 
IEEE square root. 


[Figure 12 panels. (a) Newton-Raphson flow: start from an initial value, repeat $X_{i+1} = X_i (3 - B X_i^2)/2$ N times, then form $B \cdot X_N$. (b) Faithful evaluation with a separate subtracter and multiplier: $Y = B X_i^2$, transfer, $Z = 3 - Y$, transfer, $X_{i+1} = X_i Z / 2$; 15 machine cycles a loop. (c) Improved multiplier: $Y = B X_i^2$, $Z = \mathrm{Invert}(Y) + 1$, $X_{i+1} = X_i Z / 2$; 12 machine cycles a loop. (d) Example assuming Y = 1.1111 binary: 3 - Y = 11.0000 - 1.1111 = 01.0001 binary, whereas Invert(Y) + 1 = 0.0000 + 1.0000 = 01.0000 binary.]

Figure 12. Square-root calculation. This technique for simplifying the Newton-Raphson method (a) saves three
machine cycles in each iteration. The faithful evaluation in (b) is reduced further in (c). An example of 3 - Y =
Invert(Y) + 1 (d).


Figure 12a shows the method of square-root extraction
using the N-R principle. The N-R algorithm can
obtain the reciprocal of the square root from an
appropriate initial value by repeating Equation 6.

$X_{i+1} = X_i \, [3 - B X_i^2] / 2$    (6)

where B is the source operand for the square root. To
make the convergence faster, we look up in the ROM
table an approximate value with 8-bit precision. With
only three N-R iterations shown in Equation 6, we reach
a result with 58-bit precision.
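
A minimal floating-point sketch of the iteration in Equation 6 follows. The ROM table is imitated by rounding the exact reciprocal square root to 8 fractional bits, which is an assumption made for illustration only; the function names are hypothetical, and Python doubles limit the reachable precision to about 53 bits.

```python
import math


def rsqrt_newton(b, seed_bits=8, iterations=3):
    """Approximate 1/sqrt(b) with the N-R recurrence of Equation 6."""
    # Stand-in for the ROM lookup: a seed good to about seed_bits bits.
    scale = 1 << seed_bits
    x = round(scale / math.sqrt(b)) / scale
    for _ in range(iterations):
        x = x * (3.0 - b * x * x) / 2.0   # Equation 6
    return x


def sqrt_newton(b):
    """sqrt(b) = b * (1/sqrt(b)), the final B * X_N step of Figure 12a."""
    return b * rsqrt_newton(b)


print(sqrt_newton(2.0), math.sqrt(2.0))
```

Each pass roughly doubles the number of correct bits, which is why an 8-bit seed and three iterations are enough to approach the full mantissa width.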

Shortening the estimating time. To reduce the number
of machine cycles per N-R iteration, we use the following
technique. Figure 12b shows that an N-R iteration
needs two data transfers between the multiplier and the
subtracter, and that they cause an overhead
of almost 30 percent of the machine cycles. We slightly
modified the algorithm and replaced the subtraction
operation with a complementing operation, which
we show in Figure 12c. This operation can be realized
with a small amount of hardware. The expression
$3 - B X_i^2$ is very close to [one’s complement of $B X_i^2$]
+ 1.0. An error with the weight of one LSB (least
significant bit) creeps in, yet it can be proven that its
effect is essentially negligible. The subtraction $3 - B X_i^2$
can be achieved using only inverters installed in the
multiplier. This improvement reduces the overhead by
30 percent and the N-R iteration completes in 12 cycles.
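
The one-LSB discrepancy between the true subtraction and the complement-and-add shortcut can be checked with a small fixed-point sketch. The representation (one integer bit plus `frac_bits` fraction bits, values in [0, 2)) and the function names are assumptions for illustration.

```python
def three_minus_exact(y, frac_bits):
    """Exact 3 - Y, with Y given as an unsigned fixed-point integer."""
    return (3 << frac_bits) - y


def three_minus_by_inversion(y, frac_bits):
    """Invert(Y) + 1.0, using only bit inversion and a constant add."""
    mask = (1 << (frac_bits + 1)) - 1   # one integer bit plus fraction
    return (~y & mask) + (1 << frac_bits)


# Example in the style of Figure 12d: Y = 1.1111 binary, 4 fraction bits.
y = 0b11111
print(format(three_minus_exact(y, 4), "06b"))         # 010001 = 01.0001
print(format(three_minus_by_inversion(y, 4), "06b"))  # 010000 = 01.0000
```

For every Y in this format the shortcut is smaller than the exact result by exactly one unit in the last place.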

Based on a pseudo solution obtained by the N-R 
method, the LSBs of the mantissa are recovered, and 
the classical restoring method of the square-root algo¬ 
rithm starts near the LSB. When the iterations of the 
restoring method complete, intermediate values are 
collected and a sticky bit is determined. The rounding 
operation yields the result required by the IEEE float¬ 
ing-point standard. 

[Figure 13a diagram. Recoverable labels: “Answer obtained by the” (truncated) and “The intermediate value formed before the pencil-and-paper algorithm is initiated.”]

Figure 13a illustrates how the pencil-and-paper algorithm
is actually restarted. The pseudo square root
obtained by the N-R method is in the neighborhood
$[pRoot - E_n, pRoot + E_n]$, where the real square root
lies. The $[pRoot - E_n, pRoot + E_n]$ neighborhood is
again a subset of some interval within which the IEEE
square root can be determined by the pencil-and-paper
method. Figure 13b shows the Pascal-like outline of the
conventional pencil-and-paper algorithm to obtain the
IEEE square root. Figure 13c shows how its shortcut
reconstructs the intermediate values for the pencil-and-
paper method from the pseudo root obtained by the
N-R algorithm. The pencil-and-paper algorithm starts
from near the LSB.
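
The hybrid idea can be sketched on integers. This is only an illustration under stated assumptions: `int(math.sqrt(x))` stands in for the N-R pseudo root, the bit-by-bit step squares a trial root instead of updating a residue as the pencil-and-paper method does, and the names `restoring_bits` and `hybrid_isqrt` are hypothetical. The pseudo root is assumed to be wrong by less than `2**low_bits`.

```python
import math


def restoring_bits(root, x, low_bits):
    """Decide the low `low_bits` bits of the root one bit at a time.

    Assumes the exact integer square root lies in
    [root, root + 2**low_bits).
    """
    for k in reversed(range(low_bits)):
        trial = root | (1 << k)
        if trial * trial <= x:
            root = trial
    return root


def hybrid_isqrt(x, low_bits=12):
    """Return (floor(sqrt(x)), sticky) from an approximate root."""
    pseudo = int(math.sqrt(x))        # stand-in for the N-R pseudo root
    step = 1 << low_bits
    root = pseudo & ~(step - 1)       # clear the untrusted low bits
    # Move the restart value by one step if the true root is outside
    # the interval [root, root + step).
    if root * root > x:
        root -= step
    elif (root + step) ** 2 <= x:
        root += step
    root = restoring_bits(root, x, low_bits)
    sticky = root * root != x         # nonzero residue
    return root, sticky


# 2.0 scaled by 2**100, so the root carries 50 fraction bits.
print(hybrid_isqrt(2 << 100))
```

Only the last `low_bits` bits are found by the slow one-bit-per-step search; the leading bits come from the fast approximation.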

Exception handling. One of the areas in which 
programming techniques cannot conceal the shortcom¬ 
ings of hardware is that of exception traps. When the 
function of an exception trap is realized with software, 
the overhead is enormous and visible. The Gmicro/FPU 
provides trap facilities in all operating modes, and no 
information is lost in a floating-point arithmetic excep¬ 
tion, relieving programmers from any apprehensions 
about exception handling. The FPU 

• contains a command pipeline that does not alter the 
command stream. Even though the processing is done 
in parallel, the sequential execution model is retained. 
Programmers do not have to pay attention to pipelining 
unless an exception trap occurs. 

• retains the information such as the instruction 
codes, instruction address, all operands that were used 
in the instruction, and exception flags. The user can 
extract the flags with the floating-point-state save in¬ 
structions FSAVE and FSTC. From the point at which 
the exception occurred, the user can deduce the cause of 
the problem. 

• retains the state in all pipes of the pipelines when 
an exception occurs. The FPU also retains all instruc¬ 
tions in process. Using this information, programmers 
can fix the problem or even restart the entire program 
from the point of exception after fixing some of the 
intermediate values. 


function pencilAndPaperSquareRoot(rX);
{
  This is a classic method to obtain a square root used in most IEEE-conforming
  floating-point arithmetics. The parameter rX lies in the interval (1.0, 2.0)
  to simplify the argument. The indices are assigned for mantissa bits in the
  following manner:

    0 . -1 -2 ... -(q-1) ... -N+1 -N -N-1
      ^ binary point
}

  function searchRoot(iRoot, iResidue, k);
  {
    searchRoot adds valid -k bit to square root solution and returns result.
    N+1 recursive invocations yield N+2 bits of solution.
  }
  begin
    if (k = N + 1)
      then if (iResidue = 0) then searchRoot := iRoot
        else begin
          searchRoot := iRoot + 2^(-N-1);
          iResidue := newResidue(iRoot, iResidue);
        end
      else if (root is in lower half of the interval)
        then searchRoot := searchRoot(iRoot, iResidue, k+1)
        else searchRoot := searchRoot(iRoot + 2^(-k), iResidue, k+1);
  end;

begin
  iRoot := 1.0;
  rRoot := searchRoot(iRoot, 0.0, 1);
  pencilAndPaperSquareRoot := round(rRoot);
end;


(b) 


function hybridSquareRoot(rX);
{ pRoot  : pseudo root obtained by Newton-Raphson algorithm
  iRoot1 : a candidate intermediate value to restart pencil/paper method
  iRoot2 : a candidate intermediate value to restart pencil/paper method
  iX1    : square value of iRoot1
  iX2    : square value of iRoot2
  rRoot  : root solution with guard and sticky bits
}

  function searchRoot(iRoot, iResidue, k);
  { determines one more valid bit -k of root and returns result }

  function newtonRaphson(rX);
  { returns pseudo solution pRoot lying in neighbourhood
    (rRoot-En, rRoot+En) of real solution rRoot (|pRoot - rRoot| <= En).
    An integer quantity q used in other part of the program is an integer
    such that e <
  }

  function clearBits(intValue, l, m);
  { clear bits l through m of intValue }

begin
  pRoot := newtonRaphson(rX);
  iRoot1 := clearBits(pRoot, -q, -N-1);
  iX1 := iRoot1 * iRoot1;
  iResidue1 := rX - iX1;
  if (iResidue1 = 0) then rRoot := iRoot1
  else if (iResidue1 > 0)
    then begin
      iRoot2 := iRoot1 + 2^-(q-1);
      iX2 := iRoot2 * iRoot2;
      iResidue2 := rX - iX2;
      if (iResidue2 = 0) then rRoot := iRoot2
      else if (iResidue2 > 0)
        then rRoot := searchRoot(iRoot2, iResidue2, q)
        else rRoot := searchRoot(iRoot1, iResidue1, q);
    end
    else begin
      iRoot2 := iRoot1 - 2^-(q-1);
      iX2 := iRoot2 * iRoot2;
      iResidue2 := rX - iX2;
      if (iResidue2 = 0) then rRoot := iRoot2
      else if (iResidue2 < 0)
        then rRoot := searchRoot(iRoot2, iResidue2, q)
        else rRoot := searchRoot(iRoot1, iResidue1, q);
    end;
  hybridSquareRoot := round(rRoot);
end;


(c) 


Figure 13. Hybrid algorithm to obtain an IEEE square root. 
Before starting the pencil-and-paper method (b), one must 
make the interval in which the real square root lies (a) the 
subset of the interval of convergence. The hybrid square- 
root algorithm used in the Gmicro/FPU (c). For a machine 
with a parallel multiplier, a square root can be obtained 
faster. 




Fortran to generate faster code. 


determining the system performance. 




Each processor is formed as a pipelined configuration 
of the five-chip set. When process locality is assumed, 
slower off-chip performance is not crucial to overall 
system performance because inter-chip communica¬ 
tion occurs infrequently. 


form a processing pipeline through which data travels 
in a packet format (see Figure 4a). Each chip can be 
used repeatedly to build a PE. The joint and branch unit 
also connects PEs in the system. The photo in Figure 4b 
shows a prototype of the single-board, data-driven 











































Data-driven microprocessor


As shown in Figure 1a, the Manchester data flow






(b) 

Figure 5. Chip photomicrographs of the functional 
processor (a) and queue buffer (b). 


computer. Figures 5a and 5b contain photomicrographs 
of the functional processor and queue buffer chips. 
Table 1 describes the chip set. 


Figures 6a and 6b provide the formats for input and 
operation packets. Input packets come from the off- 
chip area and include the resultant packets generated 
from the functional processor. Operation packets are 
sent from firing control to the functional processor. 
Each packet is organized in a two-word format with a 
header word and a tail word. A selection code field in 
the header word contains such information as packet 
type (that is, whether it is an initializing packet for 
program and data loading or an execution packet) and 
physical destinations (on-chip or off-chip) of the 
packet. A color/generation identifier indicates the 
environment or context to which the packet belongs. 
The most significant bit of the identifier denotes 
whether it will be used as a color or generation identi¬ 
fier. A destination-node identifier serves as part of a 
keyword in the matching-memory access as well as an 
input address of the cache-program store. Depending 
on the retention of the color/generation identifier, 
multiple sets of the input data can share an identical 
function/program. 

Figure 7 provides examples of the concept of mul¬ 
tiple-generation data processing and concurrent proc¬ 
essing of a common function. Packets belonging to 
other generations are processed concurrently, which 
could lead to the possibility of inconsistent results. 
However, the program executes consistently because 
the firing control fires only when two packets have the 
same generation identifier. Unlike the static architec¬ 
ture in which the data process executes only for a single 
color/generation identifier, this type of dynamic archi¬ 
tecture allows multiple packets of different color/gen¬ 
eration identifiers to occupy the same primitive node 
concurrently. By exploiting this architectural feature, 
each program/function’s parallelism can be superim¬ 
posed on the processor, leading to an efficient utiliza¬ 
tion of resources. Even if a program is not highly 
parallel, effective parallelism in the processor is the 
product of the number of parallelisms times the number 
of concurrently applied generations. This process re¬ 
sults in efficient pipeline processing because the maxi¬ 
mum performance is obtained when all pipeline stages 
are filled with concurrently activated data. 

Table 1. Description of the five-chip set.

Chip                   Performance   Pin count   Transistor count   Function
Functional processor   40 Mflops     208         85,000             Arithmetic operation on 32-bit data (floating-point/integer)
Queue buffer           20 MOPS       208         20,000             Buffer memory
Joint and branch       20 MOPS       180         13,000             Packet-transmission control
Firing control         20 MOPS       208         450,000            Matching operation of operands
Cache program store    20 MOPS       208         400,000            Program memory

[Figure 6 field diagrams. (a) Input packet: header word with fields HT, SL, CG, OP, DN; tail word with fields HT, OV, F, DATA. (b) Operation packet: header word with fields HT, SL, CG, OP, DN; tail word with fields HT, OV, F1, DATA1, F2, DATA2.]

CG Color/generation identifier
DATA Operand data
DN Destination node identifier
F Carry flag
HT Head/tail flag
OP Operation code
OV Overflow flag
SL Selection code

Figure 6. Organization of the input (a) and operation
(b) packets.

[Figure 7 diagrams. Legend: Context, environment from which a common function is called; i, generation identifier; INC, increment operation performed on the operand; m, n, color identifiers.]

Figure 7. Illustration of the color/generation identifier's
function in multiple-generation data processing
(a) and concurrent processing of a common function (b).

To improve the packet-flow rate through the pipeline,
each function block further subdivides into several
pipeline stages. An input packet enters the joint and
branch chip and merges with the packet stream in the
circular pipeline. The cache-program store updates the
packet header after fetching the next operation code
and destination address. In the firing-control chip, with
the color/generation code and destination address as
keywords, the packet waits for its corresponding partner.
Consequently, the firing control can be viewed as
an associative memory in which two packets that have
an identical keyword associate with one another to
generate an operation packet. The matched operation
packet is then sent to the functional processor, in which
the pipelined operation proceeds according to the
operation code. The queue buffer absorbs any fluctuation
in the packet stream. A packet entering the joint and
branch chip after going through the queue buffer either
continues to circulate within the pipeline or exits from
it depending on the value of its selection code. Each lap
around the circular pipeline corresponds to the execution
of one operation.
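
The associative matching performed by the firing control can be sketched with a dictionary keyed on the color/generation identifier and the destination node. This is a simplified software model, not the chip's design: the packet layout as a three-element tuple and the name `firing_control` are assumptions.

```python
def firing_control(packets):
    """Pair packets that share (color/generation, destination node).

    packets: iterable of (cg, dn, data) tuples in arrival order.
    Returns (fired, waiting): the operation packets produced, and the
    operands still waiting in the matching memory.
    """
    waiting = {}   # matching memory: key -> first operand to arrive
    fired = []
    for cg, dn, data in packets:
        key = (cg, dn)
        if key in waiting:
            # The partner is already present: emit an operation packet.
            fired.append((cg, dn, waiting.pop(key), data))
        else:
            waiting[key] = data
    return fired, waiting


# Two generations (0 and 1) share the same destination node 5.
stream = [(0, 5, 1.0), (1, 5, 10.0), (0, 5, 2.0), (1, 5, 20.0), (2, 5, 7.0)]
print(firing_control(stream))
```

Operands of different generations never pair with each other even though they target the same node, which is what lets several generations occupy one node concurrently.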

This dataflow processor extensively utilizes an elas¬ 
tic-pipeline processing scheme, which we later discuss 
in detail. Its floating-point processing unit achieves a 
peak performance of over 40 million floating-point 
operations per second (Mflops). 9 We subdivided this 
function into 12 pipeline stages to aid performance. 

Because data is interdependent in conventional se¬ 
quential processors, an interlocking phase and pipeline 
flushing frequently occur between pipeline operations. 
This phase recovers the previous status when a Branch 
instruction is actually executed. Therefore, as the 
number of pipeline stages increases in conventional 
processors, the cost and complexity of pipeline control 
also increase rapidly. For this reason, subdividing the 
pipeline function beyond five to seven stages for se¬ 
quential processors does not improve performance. 


Data-driven processors, however, do not concur¬ 
rently activate operations that are mutually dependent. 
(Mutual dependency in this case means the inability to 
generate one of two packets until the other packet has 
completed execution.) The firing principle permits 
unordered execution of the fired operation. This proce¬ 
dure allows execution to be carried out without any side 
effects once the operating packet is produced in the 
firing control. 


VLSI hardware 

A notable feature of our data-driven processor is that
each type of chip in the five-chip set employs a unique
processing structure: a pipeline with an elastic
data-buffering capability.10,11 Figure 8 shows the structure of
the elastic-pipeline concept. The operation packet 
consists of operation code and operand data. An opera¬ 
tion code is processed by the decoders placed at each 
stage of the pipeline and determines the function of the 
hardware primitives located between data latches. The 
latches store the resultant packet until the succeeding 





Figure 8. Structure of the elastic pipeline. 


52 IEEE MICRO 














































































































stage becomes vacant. The decoder predecodes the 
operation code at each pipeline stage to minimize de¬ 
lays. The number of stages depends on the complexity 
of the function. Data transfer between the latches uses 
the handshake mode of transfer by exchanging the 
SEND and Acknowledge signals between successive, 
self-timed, data-transfer control circuits. This method 
achieves a self-timed circuit free of any system clocks. 
Delay elements are purposely added to the SEND sig¬ 
nal line to guarantee an appropriate data-processing 
time through the logic circuit between the latches. 

The elastic pipeline provides the following charac¬ 
teristic features to a data-driven microprocessor. 

Elasticity. Because of the elastic mode of data transfer
through the pipeline, data latches alternately hold
data during a transfer and then empty, which lets data
flow continuously through the pipeline. If, on the other
hand, data begins to congest the pipeline, the empty latches
between the data are “squeezed together” as the
empty pipeline stages fill with accumulating data.
These latches then act as data buffers (see Figure 9).
This buffering capability also suits the data-driven
processor because the production rate of the
operation packets, which contain operand pairs, usually
fluctuates around an average.
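
The flowing and congested states can be illustrated with a toy model. The Python sketch below is only an illustration, not the authors' circuit: it assumes each latch is either empty (None) or holds one packet, and that a packet advances only when the next latch is vacant, which loosely mimics the SEND/Acknowledge handshake. The names `step`, `incoming`, and `blocked` are hypothetical.

```python
def step(latches, incoming=None, blocked=False):
    """Advance a toy elastic pipeline by one handshake round.

    latches:  list in which None marks an empty latch; index 0 is the input end.
    incoming: packet offered at the input; accepted only if latch 0 is empty
              after this round's moves.
    blocked:  True if the consumer at the output end refuses data this round.
    Returns (new_latches, emitted_packet, accepted_flag).
    """
    new = list(latches)
    emitted = None
    # Output end: the last latch hands over its packet unless blocked.
    if not blocked and new[-1] is not None:
        emitted, new[-1] = new[-1], None
    # Sweep from the output toward the input so each packet moves one stage at most.
    for i in range(len(new) - 2, -1, -1):
        if new[i] is not None and new[i + 1] is None:
            new[i + 1], new[i] = new[i], None
    accepted = incoming is not None and new[0] is None
    if accepted:
        new[0] = incoming
    return new, emitted, accepted
```

With the output blocked, repeated calls on a pipeline such as `["p1", None, "p2", None]` close the gaps between packets, so the latches near the output end act as a buffer; once the output is released, the packets drain in their original order.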

Noise reduction. Due to simultaneous switching of 
all the transistors in clock-synchronous systems, a peak 
current demand causes unfavorable noise along the 
power distribution lines both on and off chip. This peak 
demand may introduce disruption of the next clock 
cycle due to a logic hazard. This hazard is created by the 
excessive inductive noise during a sudden surge. The 
problem becomes serious as the clock frequency in¬ 
creases. In the elastic-pipeline scheme, however, tran¬ 
sistor switching is determined by the amount of hand¬ 
shaking that occurs distributively along the elastic 
pipeline. Obviously, this procedure smooths out the 
peak current demand and significantly decreases induc¬ 
tive power-line noise on a VLSI chip. 

Clock-skew-free design. Along with the inductive 
power-line noise problem, clock skew also becomes a 
crucial design problem at increased clock frequencies. 
Some conventional processors centrally control com¬ 
munication between each function block through a 
long, common, passive-metallic bus. In this case, the 
skew between the clock and control signals can grow 
significantly. This skew substantially limits the num¬ 
ber of logic stages that can safely be placed between the 
latches. In the elastic-pipeline processor scheme, func¬ 
tion blocks are distributively controlled by neighboring 
function blocks. Asynchronous communication be¬ 
tween successive stages is completely localized. Con¬ 
sequently, the scheme is entirely free from the global 
skew problem encountered in conventional processors. 
In terms of logic-design tasks, this means that the 



Figure 9. Elastic data in flowing (a) and congested (b)
states. Legend: pipeline stage; data (packet); the congested
region is marked as stagnating.



Figure 10. Self-timed, data-transfer control circuit. 


designer is freed from clock-distribution problems. 

Figure 10 shows an example of a self-timed, data- 
transfer control circuit, which is a key component in the 
elastic pipeline. 12 

Figure 11 on the next page shows a test chip used for 
verification and evaluation of the control circuit. The 
test chip has a data throughput rate of well over 
50,000,000 words per second. Figure 12 demonstrates 
that it takes 19.52 nanoseconds to transfer one word, 
which corresponds to one cycle of the latch-enable 
SEND signal. One of the data bits output from the latch 
shows the stabilized transfer. Since this chip is clock 
free, the data-bit signals swing according to the SEND 
signal. In conventional clock-synchronous circuits, the 
SEND signal would correspond to the clock. 









































This high data rate occurred even though the chip
was fabricated by a rather conservative, 1.3-micrometer,
complementary metal-oxide semiconductor
(CMOS) process.



Figure 11. Test chip with an elastic-pipeline structure. 



Figure 12. Data-throughput measurement in a test chip. 


The data-transfer rate for the self-timed control circuit
is given by

Rate = 1/(t_F + t_B)

where t_F is the forward-propagation delay of the SEND
signal and t_B is the backward-propagation delay. To
improve the transfer rate while keeping the processing
delay time constant at each stage, it is necessary to
minimize t_B. We propose an improved transfer-control
circuit, shown in Figure 13, to minimize t_B.10
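
As a quick numerical check of the rate relation, the following Python sketch evaluates it for the 19.52-nanosecond cycle reported for the test chip. The article does not say how that cycle divides between the forward and backward delays, so the 12 ns / 7.52 ns split used here is an arbitrary assumption chosen only for illustration.

```python
def transfer_rate(t_forward_s, t_backward_s):
    """Rate = 1 / (t_F + t_B), in words per second."""
    return 1.0 / (t_forward_s + t_backward_s)

# Hypothetical split of the measured 19.52 ns cycle (not given in the article).
t_f, t_b = 12.0e-9, 7.52e-9
rate = transfer_rate(t_f, t_b)        # roughly 51.2 million words per second

# Halving only the backward delay, with the forward (processing) delay fixed,
# raises the rate without shortening the time available to the stage logic.
faster = transfer_rate(t_f, t_b / 2)
```

Under these assumed values the full cycle gives a little over 51 million words per second, consistent with the "well over 50,000,000 words per second" figure in the text.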

The elastic-pipeline scheme applies not only to the 
ALU but also to various functional blocks in the data- 
driven processor. One example is the pipelined firing- 
control configuration shown in Figure 14. For each 
incoming result packet the lower four bits in the color/ 
generation identifier and the lower six bits of the desti¬ 
nation field are concatenated to yield a hashing address 
to the firing control. A presence bit of the read-out data 
in matching memory indicates whether any result 
packet has already been written in the address. If the
result packet’s corresponding partner is not found, the
input result packet is stored at the hashing address just
read, and the presence bit is turned on.
When the partner is found, an operation packet contain¬ 
ing a pair of operands is generated by merging the two 
result packets. 
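
The matching step can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the hardware design: the packet layout (a dict with `color_gen`, `destination`, and `value` keys) is hypothetical, the order in which the two bit fields are concatenated is assumed, a dict entry stands in for the presence bit, and collisions between unrelated packets are simply reported because the text does not describe how they are handled.

```python
def hash_address(color_gen, destination):
    """Concatenate the low 4 bits of color/generation with the low 6 bits
    of the destination field (field order is an assumption)."""
    return ((color_gen & 0xF) << 6) | (destination & 0x3F)

def fire(memory, packet):
    """Pair a result packet with its waiting partner, if there is one.

    memory: dict standing in for matching memory; a key being present
            plays the role of the presence bit.
    Returns an operation packet (dict) when the partner is found, else None.
    """
    addr = hash_address(packet["color_gen"], packet["destination"])
    tag = (packet["color_gen"], packet["destination"])
    waiting = memory.get(addr)
    if waiting is None:
        memory[addr] = packet          # store and turn on the presence bit
        return None
    if (waiting["color_gen"], waiting["destination"]) == tag:
        del memory[addr]               # partner consumed: presence bit off
        return {"color_gen": tag[0], "destination": tag[1],
                "operands": (waiting["value"], packet["value"])}
    raise RuntimeError("hash collision: handling is not described in the text")
```

The first packet of a pair is parked in memory; the second one finds it and leaves as an operation packet carrying both operands.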

In the firing control, we employed the pipelined- 
processing scheme at two different levels. At the macro 
level, the firing control subdivides into three units: 
prematching, matching memory, and packet assembly. 
In the prematching unit, the comparator processes tag 
information (that includes color/generation and desti¬ 
nation-node identifiers) in two simultaneously input 
packets. This process avoids duplicated, concurrent 
write access onto the same address from two input 
ports. The matching-memory unit has a two-input/two- 
output multiport memory and can execute two queries 
simultaneously. In the packet-assembly unit, an asynchronous
arbitration circuit merges the upper and lower
packet streams into one output packet stream.

Figure 13. Advanced self-timed, data-transfer control circuit.

Each macro-level section further subdivides into 
several pipeline stages. For example, the matching- 
memory section has four elastic pipeline stages. In the 
first stage, two prematched packets from the upper and 
lower paths merge into just one path. Memory reads 
occur in the second stage, while the third stage com¬ 
pares tag information between the read-out data and the 
input packet. A memory write can take place in the last 
stage according to the result of the comparison. 

Language processing 

As previously mentioned, the data-driven processor 
executes a class of dataflow instructions based on the 
diagrammatical data-driven language, D3L. Figure 15 
shows the flow of the language-processing system in 
which a load module in machine code is generated as a 
combination of C-like textual source programs and 
assembly-language representations. 

Compiler. We developed a compiler that 

• checks the syntax of the text and extracts control 
structures, 

• analyzes data dependencies among variables in the 
program and produces a data-dependency list, and 

• generates an object module based on the data- 
dependency list. 

The compiler automatically appends the operations 
needed for the control structure such as loops and 
conditional processes. 


Assembler. In addition, an object module can be 
directly generated from textual assembler source pro¬ 
grams in which data dependency is directly specified in 
a textual format. This approach lets the programmer 
optimize program structures. The compiler in its current
state of development does not ensure the best structure
for critical portions of the program.



Figure 15. Flow graph of C-like and assembly-lan¬ 
guage-processing system. 


Prematching unit Matching-memory unit Packet-assembly unit 



Figure 14. Block structure of the firing-control unit. 
















































































































































Figure 16. Programming-language examples: C-like (a) and 
assembly (b). 


Simulator. We developed a simulator to 
evaluate the effect of load-module mapping 
onto processors. The simulator helps tune the
fine-grain mapping of a target module onto
various topologies in multiprocessor systems.


Textual source programs. Figures 16a 
and 16b show examples of source programs 
written in C-like and assembly languages. 
These two programs have exactly the same 
meaning. Dataflow graphs that are equivalent 
to the load module generated from C-like or 
assembly-language programs are shown in 
Figure 17a and 17b. Rank refers to the critical 
path length of the node, counting from the 
input node. The path length is determined by 
the number of arcs between these two nodes. 
The load module generated from the assem¬ 
bly-language program shows better runtime 
efficiency. 
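
The rank definition lends itself to a short worked example. The Python sketch below computes, for every node of an acyclic dataflow graph, the longest path (counted in arcs) from any input node. The function name, the arc-list representation, and the sample graph are all hypothetical; they are not taken from Figure 17, and graphs containing loops are outside the scope of this sketch.

```python
from functools import lru_cache

def ranks(arcs):
    """Return {node: rank} for an acyclic graph given as (source, target) arcs.

    A node with no incoming arcs (an input or a constant) has rank 0;
    every other node has rank 1 + the largest rank among its predecessors.
    """
    preds = {}
    for src, dst in arcs:
        preds.setdefault(src, [])
        preds.setdefault(dst, []).append(src)

    @lru_cache(maxsize=None)
    def rank(node):
        if not preds[node]:
            return 0
        return 1 + max(rank(p) for p in preds[node])

    return {node: rank(node) for node in preds}

# Hypothetical graph for ((x * a1) + a2) * x.
example = [("x", "mul1"), ("a1", "mul1"),
           ("mul1", "add1"), ("a2", "add1"),
           ("add1", "mul2"), ("x", "mul2")]
result = ranks(example)
```

A lower maximum rank for the same computation means a shorter critical path, which is one way to compare two load modules that have the same meaning.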


Applications 

Figure 18 shows the processor’s support
environment for developing application programs.
This environment consists of the host
computer and the emulator, which are connected
through either an RS-232C interface or a
general-purpose interface bus (GPIB). The
emulator consists of the data-driven processor
on the board computer and such peripheral
boards as external program store, external
data memory, and host interface. Table 2
summarizes board specifications. The host
computer handles program loading and
collection of the result data.

A number of potential applications for the 
data-driven processor exist. 


Linker. Because object modules generated by the 
compiler and the assembler naturally have the same 
format, these modules can be freely linked into a load 
module. A rank analysis also occurs to determine criti¬ 
cal path lengths from the input nodes to each of the 
primitive nodes within the module. The result passes to 
a mapping program, which approximates the order of 
execution. 

Mapper. A user can specify to the mapper the
functional-level mapping of load modules onto different
individual processors in a multiprocessor system. The
mapper can also automatically map the nodes to
individual processors using rank-analysis results.


Real-time processing of high-flow-rate data 
streams. Because of the dynamic data-driven scheme 
employed in the processor, one of the most promising 
fields for applications is real-time signal processing in 
which input data reaches the processor in data-stream 
form. High-speed signal and graphics processing are 
two potential applications for the processor. 

Multilevel event-driven processing. Since the data- 
driven processor is completely free from context-switching
overhead, the processor exhibits very fast
responses to external world events. This feature prom¬ 
ises high performance in applications such as robotics 
and machine control in which multilevel event-driven 
capability is crucial. 






















Computationally intensive problems. The feasibility
of using a packet-oriented, data-driven processor in a
multiprocessor organization suggests that many
computationally intensive programs can be mapped onto a
data-driven multiprocessor system. Applications include
the solution of partial differential equations and the
emulation of neural networks.


One of the major claims of the data-driven processor is
that it is a scheme inherently free from any side effects.
Another claim is that the processor’s visual representation
of programs enhances programmer productivity. To realize
these claims, one absolutely needs a comfortable programming
environment in which the graphical specification of a
target problem is easy to perform. Therefore, high-level
graphical specifications are necessary to popularize the
data-driven programming paradigm. We continue to
investigate these needs.

Although we observed a 
peak performance of 40 
Mflops in the elastic pipeline 
and designed the one-board, 
data-driven computer to oper¬ 
ate at a speed of 20 Mflops, the 
entire system operates at 13 
Mflops. The performance 
degradation of the entire sys¬ 
tem from on-chip processes 
apparently reveals a heavy 
communication penalty due to 
off-chip data transfers. The 
highest flow of data through 
the processor is exposed to 
speed-constrained buffer driv¬ 
ers. Clearly we have to inte¬ 
grate essential functional 
blocks of the processor onto a 






Figure 18. Development-support environment. 






























































































































Table 2. Board specifications.

Type                          Data
Performance                   20 Mflops
Data length                   32 bits
Data type                     Floating-point/integer
Program memory size           Internal: 1-Kbyte steps
                              External: 512-Kbyte steps
Data memory size              External: 4 Gbytes
External memory access rate   10 Mwords/second (1 word = 4 bytes)
I/O interface
  Multi PE connection         Two 42-bit in ports
                              Two 42-bit out ports
  External program ports      One 42-bit data port
                              One 74-bit address port
  External data ports         One 42-bit data port
                              One 58-bit address port
Board size                    48 x 38.5 centimeters


single VLSI chip, and we are attempting to do that. We
are also working on improving the instruction-set
architecture of the processor.


Acknowledgments

Although it is impossible to thank individually all
those who assisted with this research and development
project, we express our sincere appreciation to all our
colleagues. Without their endeavors, this processor
would not exist. We also gratefully acknowledge T.
Shichiku for his efficient support in proofreading our
manuscript.


References

1. J.A. Sharp, Data-flow Computing, Ellis Horwood Ltd.,
Chichester, England, 1985.

2. H. Nishikawa et al., “Architecture of a One-Chip
Data-Driven Processor: Q-p,” Proc. 16th Int’l Conf. Parallel
Processing, The Pennsylvania State University Press,
University Park, Penn., Order No. FJ783, IEEE CS Press,
Los Alamitos, Calif., Aug. 1987, pp. 319-326.

3. D. Pountain, “Microprocessor Design—The Transputer
and Its Special Language, Occam,” Byte, Vol. 9, No. 8,
Aug. 1984, pp. 361-366.

4. J. Backus, “Can Programming Be Liberated from the von
Neumann Style?” 1977 ACM Turing Award Lecture,
Comm. ACM, Vol. 21, No. 8, Aug. 1978, pp. 613-641.

5. W. Myers, “DeMarco Foresees Change to Parallel
Processing,” IEEE Software, Vol. 5, No. 2, Mar. 1988,
pp. 92-93.

6. J.R. Gurd, C.C. Kirkham, and I. Watson, “The
Manchester Prototype Dataflow Computer,” Comm. ACM,
Vol. 28, No. 1, Jan. 1985, pp. 34-52.

7. J.B. Dennis, G. Gao, and K.W. Todd, “Modeling the
Weather with a Data Flow Supercomputer,” IEEE Trans.
Computers, Vol. C-33, No. 7, July 1984, pp. 592-603.

8. T. Yamasaki et al., “VLSI Implementation of the
Variable Length Pipeline Scheme for Data-Driven
Processors,” Digest Symp. VLSI Circuits, Aug. 1988, pp.
31-32.

9. S. Komori et al., “A 40-Mflops 32-Bit Floating-Point
Processor,” Digest Int’l Solid-State Circuits Conf., Feb.
1989, pp. 46-47.

10. H. Terada et al., “VLSI Design of a One-Chip
Data-Driven Processor: Q-v1,” Proc. Fall Joint Computer
Conf., IEEE CS Press, Los Alamitos, Calif., Oct. 1987,
pp. 594-601.

11. K. Asada et al., “Hardware Structure of a One-Chip
Data-Driven Processor: Q-p,” Proc. 16th Int’l Conf. Parallel
Processing, Aug. 1987, pp. 327-329.

12. S. Komori et al., “An Elastic Pipeline Mechanism by
Self-Timed Circuits,” J. Solid-State Circuits, Vol. 23,
No. 1, Feb. 1988, pp. 111-117.



Shinji Komori is a senior engineer in the LSI Research and 
Development Laboratory at Mitsubishi Electric Corporation. 
He is currently researching VLSI-oriented, data-driven archi¬ 
tectures and developing data-driven microprocessors. 

Komori received the BS degree in applied mathematics and 
physics from Kyoto University. He is a member of the Insti¬ 
tute of Electronics, Information and Communication Engi¬ 
neers of Japan (IEICE). 

















Kenji Shima is a senior engineer in the Industrial Systems 
and Electronics Research and Development Laboratory at 
Mitsubishi Electric Corporation. He has been involved in the 
research and development of sound synthesis, voice analysis, 
and voice recognition since 1974 and is now interested in 
developing applications for data-driven processors. 

Shima received the BS and MS degrees in electrical engi¬ 
neering from Kobe University. He is a member of the IEICE 
and the Acoustical Society of Japan. 


Souichi Miyata is manager of the data-driven microproces¬ 
sor development group in the Integrated Circuits Group at 
Sharp Corporation. He has experience in the development of 
memory and logic devices and has been researching and 
developing highly parallel processing systems since 1982. 

Miyata holds a Doctor of Engineering degree in material 
sciences from Osaka University. He is a member of the IPSJ 
and Japan’s Society of Applied Physics. 


Hiroaki Terada is a professor at Osaka University where he 
is responsible for research and education in digital systems 
and circuits. His current research activities include develop¬ 
ing a diagrammatical data-driven language, a data-driven 
architecture, and its VLSI-oriented implementation. He was 
a visiting professor at the University of Essex, England, and 
a visiting scholar at the Centre National d’Etudes des Tele¬ 
communications, Lannion, France. 

Terada received the BE degree in electrical engineering 
from Ehime University and the ME and PhD degrees in 
electronic communications from Osaka University. He has 
served as chair of the Technical Group on Switching Engi¬ 
neering, IEICE, and as a member of the Board of Directors of 
the IPSJ. He received the IEICE Achievement Award in 1988 
and the IEICE Kobayashi Memorial Achievement Award in 
1989. 


Questions concerning this article can be directed to Shinji 
Komori, Advanced Microprocessor Development Depart¬ 
ment, LSI Research and Development Laboratory, Mitsub¬ 
ishi Electric Corporation, 4-1 Mizuhara, Itami, Hyogo Japan 
664. 





Toshiya Okamoto is a senior engineer in the Integrated 
Circuits Group at Sharp Corporation. He has been research¬ 
ing and developing highly parallel processing systems since 
1982. He currently develops dataflow microprocessor archi¬ 
tectures and software tools. 

Okamoto received the BS and MS degrees in system engi¬ 
neering from Kobe University. He is a member of the IPSJ. 






















SPECIAL FEATURE 


The Transputer 
T414 Instruction Set 



Have you wondered how to decipher the transputer
instruction set? Here's your chance to compare it with
something you know.


Multiprocessor systems offer a number of advantages over the
single-processor configuration. However, multiprocessor systems require
interprocessor communication, some form of time-slicing, and
primitives for parallel operation. The transputer is a microprocessor designed
specifically for a multiprocessor environment. Because of this, many of the
instructions performed by the transputer are unfamiliar to designers of
single-processor systems. In addition, Inmos Ltd.—the manufacturer of
transputers—uses very short notations with implied operands within its
assembly language. This approach differs greatly from the techniques used
for 32-bit microprocessors like the ones in Motorola’s M68000 and National
Semiconductor’s NS32000 families.

Consequently, we wrote this article as a tutorial on the instruction set of the
T414 transputer and include assembly-language notations that are more
familiar to microprocessor-system designers than the Inmos documentation.

Jean-Daniel Nicoud 
Andrew Martin Tyrrell 

Ecole Polytechnique
Fédérale de Lausanne


Transputer types 

All types of transputers have the same basic instruction set. The IMS T414
and T800 transputers are powerful, pin-compatible, 32-bit processors
especially designed for implementing parallel architectures. IEEE Micro has
already covered the hardware and general-software aspects of these
transputers.1 Note that both the T414 and the T800 have four high-speed serial
links supported by internal direct-memory access (DMA) channels, two
timers to assist in switching tasks, and internal static RAM (Figure 1). The
T800 differs from the T414 by having 4 Kbytes of RAM (instead of 2
Kbytes) as well as an on-chip floating-point unit, which was well described
in Homewood et al.1 Except for access time, no difference exists between
internal and external memory in either transputer. The total memory space for
the T414/T800 is 4 Gbytes, although the design philosophy is that
transputers do not need abundant memory. More important is that a
number of transputers communicate through their four dedicated serial
channels at speeds of up to 20 Mbits per second. Because the address and
data buses are multiplexed, the resulting package contains only 68 pins.

Downloading a bootstrap program into internal or external memory can 
be accomplished through the serial links. Hence, a minimum, additional 
transputer can consist of a single chip. 




0272-1732/89/0600-0060$01.00 © 1989 IEEE

















The T212 is the 16-bit version of the transputer. A 
word is 32 bits wide on the T414/T800 and 16 bits on 
the T212. To make the T212 architecture and instruc¬ 
tion set compatible with those of the T414, the assem¬ 
bler refers to a bit count of 16 instead of 32 and multi¬ 
plies by two instead of four in terms of bytes. This ap¬ 
proach is quite different from the process involved with 
M68000/MC68020 compatibility, for instance, because 
the data structures change rather than the number of 
transfer cycles. 

The M212 is a specially programmed disk processor.
It contains a modified T212 16-bit processor, on-chip
memory, and two byte-wide bidirectional ports.
Disk-interface hardware provides for a standard floppy or
Winchester disk interface. A set of procedures in
ROM performs basic disk accessing.

The T414 is a reduced instruction set computer 
(RISC) primarily because its architecture does not con¬ 
tain the standard registers and the more-or-less or¬ 
thogonal operations of conventional microprocessors. 
Its instruction set is simple and efficient; the instruction 
cycle is very short. 

Inmos designed the concurrent language Occam1,2
for the transputer to have both high- and low-level
facilities. Occam specifically programs a set of
communicating processes that take place between
transputers via channels. The instruction set was designed to
promote Occam compiler efficiency.


Registers, memory, 
and communications 

The architecture of the T414 is rather simple and
borrows architectural ideas from Texas Instruments’
TMS9900 microcomputer and the old Hewlett-Packard
calculators.

Registers. As summarized in Figure 2, the T414 pro¬ 
cessor consists of an instruction pointer I (the usual 





Figure 1. Block diagram of the T414/T800 transputer. 



A = A register 
B = B register 
C = C register 
E = Error flag 
H = Halt-on-error flag 
W = Workspace pointer 


Figure 2. Processor model. 















































































T414 instruction set 


The T414 instruction set 
is simple and efficient. 


program counter) and an invisible instruction-fetch 
buffer of two 32-bit registers. The processor extracts 
8-bit instructions from this buffer to prepare the con¬ 
tents of the 32-bit operand register O. The usual micro¬ 
processor instruction register splits into the operand 
register O and the 4-bit function code of the last byte of 
each instruction, as shown later. 
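
Because the encoding details are deferred here, a small model may help to picture how byte-sized instructions build a 32-bit operand. The Python sketch below follows the prefixing scheme described in Inmos's published transputer documentation rather than anything stated in this section: each byte carries a 4-bit function code and a 4-bit data value, a prefix instruction shifts the operand register left by four bits, and a negative prefix complements it first. The numeric function codes (`PFIX`, `NFIX`, `LDC`) and the helper names are assumptions for illustration.

```python
PFIX, NFIX, LDC = 0x2, 0x6, 0x4   # function codes assumed from Inmos documentation
MASK = 0xFFFFFFFF

def encode(function, operand):
    """Return the byte sequence for `function` with a signed operand."""
    if 0 <= operand < 16:
        return [(function << 4) | operand]
    if operand >= 16:
        return encode(PFIX, operand >> 4) + [(function << 4) | (operand & 0xF)]
    return encode(NFIX, (~operand) >> 4) + [(function << 4) | (operand & 0xF)]

def decode(code):
    """Yield (function, signed 32-bit operand) pairs from a byte sequence."""
    o = 0                                   # models the operand register O
    for byte in code:
        function, data = byte >> 4, byte & 0xF
        o |= data
        if function == PFIX:
            o = (o << 4) & MASK
        elif function == NFIX:
            o = (~o << 4) & MASK
        else:
            signed = o - (1 << 32) if o & 0x80000000 else o
            yield (function, signed)
            o = 0                           # O is cleared after each instruction
```

In this model a small constant such as 3 fits in a single byte, while each additional four bits of operand costs one more prefix byte, which matches the 4, 8, 12, ... , 32-bit operand sizes mentioned for the immediate mode.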

Memory. The 32-bit workspace pointer W can point 
anywhere in memory, making context switching easy 
and fast. A single-byte instruction accesses the first 16 
locations. Three 32-bit registers A-C form an arith¬ 
metic evaluation stack. They should not be seen as a set 
of independent registers, which is the way they usually 
function on many other microprocessors. The compiler 
allocates room for user registers inside the workspace 
of a process. Parameters, even temporarily, are never 
kept in the evaluation stack. Instructions that do not 
use registers A, B, or C, like Jumps, leave these registers 
in some unpredictable state. As detailed later, the usual 
condition status register (CSR or flags) does not exist. 
Overflow indications or results of a comparison remain 
in register A. An error flag E and a halt-on-error flag H 
exist for handling overflows, but have no counterpart
in standard microprocessors.

The processor maintains two linked scheduling lists
for simulating parallel operations. Four registers
hold the ends of these linked lists. Of the two registers
for high-priority processes, one (FPtrReg0) points to
the front of the list and the other (BPtrReg0) to the
back. Two similar registers (FPtrReg1 and BPtrReg1)
serve low-priority processes. The process-scheduling
registers are detailed later.

The local timing operations on the transputer employ
six registers:

• two clock (timer) registers (ClockReg0 and
ClockReg1—one for each priority level),

• two registers pointing to the first items on the two
priority timer queues (TPtrLoc0 and TPtrLoc1), and

• two registers indicating the time of the first event to
occur (TNextReg0 and TNextReg1—again, one for
each priority level).

Two bits indicate whether there is anything on either of
the timer queues (TEnabled0 and TEnabled1). The
high-priority clock increments every 1 µs, the
low-priority clock every 64 µs.
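
The two tick periods translate into clock values as in the short Python sketch below. The tick periods come from the text; treating each clock as a 32-bit word that wraps around, and comparing times modulo the word size, are assumptions made for this illustration, as are the function names.

```python
HIGH_TICK_US, LOW_TICK_US = 1, 64   # tick periods given in the text
WORD_MASK = 0xFFFFFFFF              # clocks modeled as 32-bit words (assumption)

def ticks(elapsed_us, tick_us):
    """Clock value after `elapsed_us` microseconds, starting from zero."""
    return (elapsed_us // tick_us) & WORD_MASK

def is_after(t1, t2):
    """True if clock value t1 is later than t2, allowing for wraparound."""
    return 0 < ((t1 - t2) & WORD_MASK) < 0x80000000

one_second_high = ticks(1_000_000, HIGH_TICK_US)   # 1,000,000 ticks
one_second_low = ticks(1_000_000, LOW_TICK_US)     # 15,625 ticks
```

Under the 32-bit assumption, the high-priority clock wraps after 2^32 microseconds, a little over 71 minutes, so time comparisons need the modular form rather than a plain greater-than test.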


Communications. A channel provides a communica¬ 
tion path between two processes. Single words in 
memory (internal channels) implement channels be¬ 
tween processes executing on the same transputer. 
Point-to-point links (external channels) implement 
channels between processes executing on different 
transputers. The compiler allocates the memory loca¬ 
tion for internal communications. Four external channel 
locations are reserved at what is considered as the bottom 
of transputer memory (H’80000000 to H’8000001C, 
where H’ indicates HEX, or hexadecimal). (See Re¬ 
served I/O area in Figure 2.) 

The implementation of external communications 
uses three invisible, separate registers to support 
autonomous DMA, which frees the processor for other 
work. These link registers hold 

• the count of the number of bytes to be transferred, 

• a pointer to the location in memory (to input or 
output), and 

• a pointer to the workspace of the process. 

When control is transferred to the link registers, the 
process is removed from the schedule. Once the com¬ 
munication is complete, the link interface signals the 
processor to add the process to the end of the list of ac¬ 
tive processes. 
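
The interplay between the scheduling lists and a link transfer can be sketched as follows. This Python model is a simplification under stated assumptions: each list is a queue of workspace pointers, a process waiting on a link is simply absent from both lists, and the rule that high-priority processes always run first is an assumption not stated in this section. The class and method names are hypothetical.

```python
from collections import deque

class Scheduler:
    """Toy model of the two scheduling lists (0 = high, 1 = low priority)."""

    def __init__(self):
        self.lists = {0: deque(), 1: deque()}
        self.waiting_on_link = set()

    def schedule(self, wptr, priority):
        """Add a process to the back of its priority list."""
        self.lists[priority].append(wptr)

    def next_process(self):
        """Take the process at the front, preferring high priority (assumed)."""
        for priority in (0, 1):
            if self.lists[priority]:
                return self.lists[priority].popleft(), priority
        return None

    def start_link_transfer(self, wptr):
        """The running process hands control to a link and is descheduled."""
        self.waiting_on_link.add(wptr)

    def link_done(self, wptr, priority):
        """The link interface signals completion: rejoin the end of the list."""
        self.waiting_on_link.discard(wptr)
        self.schedule(wptr, priority)
```

While the link's DMA moves the data, the processor keeps taking other processes from the front of the lists; the communicating process reappears at the back only when the transfer has finished.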

Addressing modes 

As in all RISCs, the addressing modes are very simple.
The Common Assembly Language for Microprocessing
(CALM) notations3,4 appear here in addition to
the Inmos notations.5,6 Those Inmos notations that
differ from CALM appear in italics.

The main conventions of CALM are the use of

• A, B, etc., for register names (reserved symbols);

• ADDR expressions (numbers) for memory or I/O
addresses;

• [A], [ADDR] for contents of registers A or ADDR
(corresponding to a value or a memory address);

• [A] + DISP memory location for the address that is
the sum of the contents of register A and the value
DISP; and

• [A+] notations to indicate that the contents of
register A are incremented after the transfer of the location
pointed to by register A.

Immediate mode. The immediate-addressing mode 
(named load constant by Inmos) takes the value pre¬ 
pared by the instruction in the operand register. Only 
32-bit transfers are possible, but short instructions exist 
to move 4, 8, 12, ... , 32-bit values zero- or sign- 
extended to 32 bits. The sole destination is the evalua¬ 
tion stack. For instance, the following instructions (in 
the following examples, the CALM notation appears 
first, then the Inmos notation with an explanation of 
the instruction): 














Table 1. Comparison of CALM and Motorola notations.

CALM                                     M68000
MOVE.32 {[A6] + DISP1*4} + DISP2*4,D0    MOVE.32 ([DISP1*4,A6],ZA0*1,DISP2*4),D0
MOVE.32 [A6] + DISP1*4,A6                MOVE.32 (DISP1*4,A6),A6
MOVE.32 [A6] + DISP2*4,D0                MOVE.32 (DISP2*4,A6),D0


MOVE.32 #3,A-B-C (ldc 3, load constant)

copy the value 3 into register A, while the previous contents
of A move into B and those of B into C.

CALM refers to the Inmos load-pointer mode with 
the term “generalized immediate addressing” (other 
microprocessors use the term “load address”) as 
follows: 

MOVE.32 #[A] + DISP,A (ldnlp disp, load nonlocal
pointer, that is, load pointer from memory outside
the workspace).

Notice that this instruction addresses a word that is 
aligned with a multiple of two in the T212 and a multi¬ 
ple of four in the T414. It is equivalent to an ADD.32 
#DISP,A instruction and loads into A the contents of 
register A (considered as a pointer to a table), plus the 
word displacement DISP. 
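
The word scaling described above amounts to a single multiply and add. The Python sketch below is illustrative only; the function name mirrors the Inmos mnemonic, but the `part` parameter and the table of word sizes are conveniences introduced here.

```python
BYTES_PER_WORD = {"T414": 4, "T212": 2}

def ldnlp(a, disp, part="T414"):
    """Return the new contents of A: old A plus a word displacement in bytes."""
    word_bytes = BYTES_PER_WORD[part]
    mask = (1 << (8 * word_bytes)) - 1      # wrap to the part's word length
    return (a + disp * word_bytes) & mask

t414_pointer = ldnlp(0x80000100, 3)          # 0x8000010C
t212_pointer = ldnlp(0x8100, 3, part="T212") # 0x8106
```

The same source-level displacement of three words therefore advances the pointer by 12 bytes on the T414 and by 6 bytes on the T212.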

Relative mode. Absolute addressing does not exist on 
the transputer, except for special instructions in which 
the Inmos notation is implied for timer, communica¬ 
tion, and scheduling locations. Relative addressing is 
only available for Jump instructions and guarantees the 
relocatability of modules. Displacement can be a 4, 8,
12, ... , 32-bit signed value. For instance,

JUMP ADDR (j addr, jump to address addr)

jumps to the location ADDR; the value coded in the in¬ 
struction is the displacement from the instruction 
pointer. 

Register-indirect mode. Register-indirect addressing 
is called local when it refers to the workspace register 
W, and nonlocal when it refers to the A register as a 
base pointer. For instance, because DISP is a word 
displacement, one can write 

MOVE.32 A,[W] + DISP (stl disp, store local)
MOVE.32 [A] + DISP,A (ldnl disp, load nonlocal)
MOVE.8 [A],A (lb, load byte)
MOVE.32 [A] + 4*[B],A (wsub, word subscript)

The double-indirect mode, named static chain by In¬ 
mos, can be written in CALM: 

MOVE.32 (jWJ + DISP1] + DISP2,A 

This mode is implemented by two transputer instruc¬ 
tions (preceded by their CALM equivalents). 


MOVE.32 (W) + DISPl.A-B-C (Idl displ, load 

local) 

MOVE.32 [A] + DISP2,A (Idnl disp2, load 

nonlocal) 

Table 1 provides equivalent M68000 notations for the 
previous three CALM instructions. Transputer regis¬ 
ters W and A are mapped onto MC68020 registers A6 
and DO. 


Data types 

The T414 supports only 8- and 32-bit integers. The
8-bit numbers are zero-extended when used as 32-bit
values. The T800 also supports 32- and 64-bit floating-point
numbers. The 32-bit addresses and integers are
signed (two’s complement). Address -2^31 = H’80000000
(MostNeg, most negative) is reserved for communication
and scheduling. Addresses of 32-bit words (data,
pointers) must be word aligned in memory; the last two
bits of these addresses, named a byte selector, must be
zero.


Programming philosophy 

Due to the architecture and reduced instruction set of
the transputer, its programming philosophy is quite different
from that of other 32-bit processors like the DEC
VAX, Motorola M68000, National Semiconductor
NS32000, or Intel iAPX-386. Transputer workspace,
stack, and overflow handling require a complete
change in thinking for the programmer who is accustomed
to 8-, 16-, and 32-bit sequential processors.

Workspace. Each parallel process has a workspace
associated with it, that is, a block of 32-bit words in
memory. Register W points to the beginning of this
block during the execution of the process. The local
variables and the return address are both reached by
positive offsetting. Negative offset addresses are used
for the timer and communications. The workspace address
and its priority level, given in the process descriptor,
completely identify a process. The priority of a process
is stored in the least significant bit (LSB) of the
workspace address.
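
The packing of the priority into the workspace address can be sketched as follows. This is only an illustrative model: the function names are hypothetical, and it assumes the T414 word size of 4 bytes, so that a word-aligned workspace address always has its low bits free.

```python
WORD_BYTES = 4  # T414 word size in bytes (assumption stated in the text)


def make_descriptor(workspace: int, priority: int) -> int:
    """Combine a word-aligned workspace address with a priority bit."""
    if workspace % WORD_BYTES != 0:
        raise ValueError("workspace address must be word aligned")
    if priority not in (0, 1):
        raise ValueError("priority is 0 (high) or 1 (low)")
    return workspace | priority


def split_descriptor(descriptor: int) -> tuple:
    """Recover (workspace address, priority) from a descriptor."""
    return descriptor & ~1, descriptor & 1
```

For example, a low-priority process whose workspace starts at H’1000 would be described by the value H’1001 in this model.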



Figure 3. Workspace and subworkspace during a subroutine. 


Calling a subroutine creates a subworkspace. The
system automatically reserves four words: the return
address (instruction pointer I) and the three evaluation
stack registers (A, B, and C), as shown in Figure 3. Adjust
instructions allow space to be allocated for variables.
The system reserves locations above the main
workspace for process scheduling.

Evaluation stack. Load instructions prepare the
evaluation or operand stack. The register pair A,B can
be exchanged by using the transputer Reverse (rev)
instruction. Two-operand Arithmetic and Logic instructions
use registers A and B, with the result placed in A.
The contents of register C usually move into B; the new
contents of C remain undefined.

Some instructions take their operands from the 
stack. Move Block instructions, for instance, consider 
C as the source pointer, B as the destination pointer, 
and A as the block-length counter. 
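
A minimal model of the three-register evaluation stack may make these rules concrete. The class and method names below are hypothetical, and `None` is used here to stand for an undefined register; this is a sketch of the behavior described above, not of the hardware.

```python
import operator


class EvalStack:
    """Toy model of the A, B, C evaluation stack."""

    def __init__(self):
        self.a = self.b = self.c = None  # None stands for "undefined"

    def load(self, value):
        # A load pushes: the old A moves to B and the old B moves to C.
        self.c, self.b, self.a = self.b, self.a, value

    def rev(self):
        # Reverse exchanges the register pair A,B.
        self.a, self.b = self.b, self.a

    def binary(self, op):
        # Two-operand instruction: result in A, C moves into B,
        # and the new contents of C are undefined.
        self.a = op(self.b, self.a)
        self.b, self.c = self.c, None


stack = EvalStack()
for value in (1, 2, 3):
    stack.load(value)
stack.binary(operator.add)  # A = 2 + 3, B = 1, C undefined
```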

Overflow handling. Unlike other microprocessors, 
the transputer has no equal or carry flags. Rather, at the 
end of some operations, register A is loaded with a 
single bit. The meaning of this bit corresponds to one 
flag. The error-flag bit is set by several instructions 
when the result overflows (see section on arithmetic 
instructions). 


Concurrent processes 

The transputer contains a real-time, hard-wired
kernel. A process is defined by its workspace address.
Workspaces are linked to form two lists of waiting processes
(priority 0 = high and 1 = low). Special instructions
(shown later) allocate and link processes.

Process scheduling. Dedicated registers in the processor
point to the front and back of the active process
lists, as shown in Figure 4. One can see four processes
waiting to be executed: three are high priority and one is
low. FPtrReg0 points to the workspace of the process
(value W1) at the front of the high-priority list (a). The
list pointer of W1 (noted W1-2 and found at address
{FPtrReg0}-2*4) points to the next process’
workspace W3 (b). The word at address W1-1 contains
a pointer to the first instruction to be executed when the
process is executed. BPtrReg0 points to the workspace
of the last process (d) as does W3-2. As only three processes
occupy this list, the process’s workspace pointed
to by (c) and (d) is the same (W2). For the single low-priority
process W4, both front and back pointers
(FPtrReg1 and BPtrReg1) point to the same process’
workspace W4 (e,f).

Figure 4. Typical process lists (three processes in the priority 0 list, one process in the priority 1 list).

WORKSPACE   POSSIBLE VALUES

W           -1 after altwt or taltwt
            A after diss, disc or dist
W-1         pointer to next instruction when descheduling
W-2         workspace pointer to next process in list
            of executable processes
W-3         MinInt + 1 after alt or talt
            MinInt + 3 after enbs, enbc or enbt
W-4         MinInt + 1 after enbt
            MinInt + 2 after talt
W-5         time-out value after enbt

Figure 5. Workspace locations for the Alternative (ALT) instruction.

A new process is placed at the end of the high- or low-priority
list. When the current process is removed from
the schedule, it is placed at the back of the list by updating
BPtrReg, and the process at the front of the list is
executed.

A high-priority process always takes precedence over
a low-priority process. A low-priority process is removed
from the schedule when a high-priority process
becomes available for execution. When no more high-priority
processes exist, the low-priority process continues
to execute.
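
The scheduling rules above can be sketched with two queues. This is a simplified model only: the transputer links workspaces through their W-2 words and the FPtrReg/BPtrReg registers, whereas the hypothetical `ProcessLists` class below uses Python deques and ignores preemption timing.

```python
from collections import deque

HIGH, LOW = 0, 1  # priority 0 = high, 1 = low, as in the article


class ProcessLists:
    """Toy model of the two active process lists (not the hardware layout)."""

    def __init__(self):
        self.lists = {HIGH: deque(), LOW: deque()}

    def add(self, workspace, priority):
        # A new or descheduled process goes to the back of its list.
        self.lists[priority].append(workspace)

    def next_process(self):
        # High priority always wins; a low-priority process runs only
        # when the high-priority list is empty.
        for priority in (HIGH, LOW):
            if self.lists[priority]:
                return self.lists[priority].popleft()
        return None
```

With the four processes of Figure 4 added in any interleaving, this model runs W1, W3, and W2 before W4.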

Seven locations near the bottom of the transputer’s
memory map hold the state of a preempted, low-priority
process. At most one process can be preempted at a
time, so these seven 32-bit memory locations suffice.

Alternatives. The Occam Alternative (ALT) instruction
is a very powerful construct that allows choices between
a number of branches. The testing of inputs
and/or Boolean conditions, with the possible inclusion
of timers, provides the information to make these
choices. The transputer contains 11 instructions to
realize this construct. Specific locations in the
workspace of a process implement this instruction. The
workspace notations shown in Figure 5 correspond to
any of the four workspaces shown in Figure 4. W, W-1,
and W-2 have the same meaning as W1, W1-1, and
W1-2 for process scheduling. W-3 signals a satisfied
alternative, with MinInt + 3 representing the value true
and MinInt + 1 representing the value false (MinInt, or
the Minimum Integer value, is equal to -2^31 on the T414
and -2^15 on the T212). W-4 is used if a timer is included
for a time-out. It signals an enabled timer, with
MinInt + 1 as true and MinInt + 2 as false. Finally, W-5
contains the earliest time a time-out can occur (if one is
specified within the ALT instruction). Figure 5 summarizes
these values. Workspace content depends upon
the instructions that are executed. The instructions mentioned
in Figure 5 are explained later.

Timer queue. The Timer Input (tin) instruction can
suspend processes on a transputer until a specified time.
The suspended processes are held in one of two linked
lists (one for each priority level). Each list has a slightly
different structure, as shown in Figure 4. W-4 in Figure
5 contains a pointer to the next process in the list, except
for the last process, in which W-4 contains MinInt.
Location W-5 holds the designated time for awakening
the process. The process order in the list is determined
by the value in W-5. W-3 always contains the value
MinInt + 2 for a process on a timer queue. Locations
W, W-1, and W-2 are used as previously described.


Structure 

The basic transputer instruction set consists solely of
1-byte instructions. To obtain instructions that look
similar to those of the usual microprocessor, one needs
several transputer instructions. For instance, the
previously mentioned Move Block instruction starts
with its operands in the workspace. It takes three instructions
to move them onto the evaluation
stack before the transputer Move Block instruction can
occur. In the best case, if the operands are initially in
the first sixteen 32-bit word locations of the workspace,
this set of instructions takes 5 bytes. In the worst
case, it can take up to 26 bytes.
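
The 5-byte and 26-byte figures can be checked with a short calculation. The sketch below assumes the prefix scheme described in this article (each byte carries 4 operand bits, and a negative operand always needs a Negative Prefix) and the code F/4A for the Move instruction; the function names are hypothetical.

```python
MOVE_CODE = 0x4A  # operand of the Operate instruction for move (F/4A)


def instruction_bytes(operand: int) -> int:
    """Bytes needed for one instruction carrying `operand`."""
    extra = 0
    if operand < 0:
        # The nfix byte carries the complemented high part of the operand.
        operand = (~operand) >> 4
        extra = 1
    count = 1
    while operand >= 16:  # one more pfix per extra 4 bits
        operand >>= 4
        count += 1
    return count + extra


def move_block_bytes(source: int, dest: int, length: int) -> int:
    """Three Load Local instructions plus the Move instruction."""
    loads = sum(instruction_bytes(offset) for offset in (source, dest, length))
    return loads + instruction_bytes(MOVE_CODE)


best = move_block_bytes(1, 2, 3)          # offsets below 16: 3 * 1 + 2 bytes
worst = move_block_bytes(0x10000000, 0x20000000, 0x30000000)  # 3 * 8 + 2 bytes
```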




Figure 6. Coding of several positive and negative integers (hexadecimal values 100, 10, F, 1, 0, -1, -2, -10, -11, -FF, -100, -101, and -102).


This difference in length is due to the particular encoding
of the offsets and immediate values. As mentioned,
a basic instruction consists of a 4-bit operation
code and a 4-bit operand, which is loaded into the
operand register. The operand register is cleared at the
end of all instructions. Values larger than 4 bits can be
prepared in the operand register by using two basic instructions,
Prefix (pfix) and Negative Prefix (nfix). The
Prefix instruction loads its 4-bit operand into the
operand register and then shifts the contents of the
operand register O to the left by 4 bits (Figure 2). This
procedure allows the preparation of instructions that
require a 4-, 8-, 12-, ..., 32-bit positive value or
displacement. Negative values or displacements must
be prepared by appropriate use of the Negative Prefix
instruction, which loads the 4-bit operand in the low bits
of the operand register, complements the complete register,
and then shifts 4 bits to the left.

Figure 6 illustrates how the Occam compiler uses this
process to encode operands between the hexadecimal
values 100 and -102 (instruction codes or prefixes are
shaded). For instance, value -100 is encoded H’6FX0,
where 6 is the negative prefix, and F is a first digit loaded
into the operand register (and is immediately complemented).
X (shown as a shaded blank) is the 4-bit code
of the instruction, and 0 is the second digit that must be
shifted into the operand register to obtain the result.
Students can gain from the exercise of writing the
routine to prepare the correct digits according to the
number.
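
One possible solution to this exercise is sketched below. It is not taken from the Occam compiler; it simply follows the Prefix and Negative Prefix rules stated above, using the 4-bit codes 2 (pfix) and 6 (nfix) given in this article. The `decode` function models the operand register so the result can be checked, and both function names are hypothetical.

```python
PFIX, NFIX = 0x2, 0x6  # 4-bit codes of Prefix and Negative Prefix


def encode(opcode: int, operand: int) -> list:
    """Return the byte sequence for a basic instruction and its operand."""
    last = (opcode << 4) | (operand & 0xF)
    if 0 <= operand < 16:
        return [last]
    if operand >= 16:
        return encode(PFIX, operand >> 4) + [last]
    # Negative operand: nfix carries the complemented upper digits.
    return encode(NFIX, (~operand) >> 4) + [last]


def decode(code: list) -> tuple:
    """Run the operand register over `code`; return (opcode, operand)."""
    oreg = 0
    for byte in code:
        function, digit = byte >> 4, byte & 0xF
        if function == PFIX:
            oreg = (oreg | digit) << 4
        elif function == NFIX:
            oreg = (~(oreg | digit)) << 4
        else:
            return function, oreg | digit
    raise ValueError("byte sequence ends inside a prefix")


JUMP = 0x0
example = encode(JUMP, -0x100)  # the H'6FX0 case from Figure 6, with X = 0
```

Python integers are unbounded, so the complement here is not truncated to 32 bits; a model of the real register would mask the result to the word length.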

As an example of this mechanism within instructions,
let us consider the Jump instruction, the 4-bit
code of which is zero (Figure 7). If the displacement
fits in 4 bits and is positive, a single byte is sufficient
to code the Jump instruction. The value added to the
instruction pointer I is the contents of the operand register
that has been loaded with the 4-bit value.

Using a Prefix (pfix with a 4-bit code 2) or Negative
Prefix (nfix with a 4-bit code 6) instruction before the
basic Jump instruction allows a positive 8-bit operand,
or a negative 4-bit one. The Prefix instruction can be
used as many times as required, as shown in Figure 7.
This procedure provides rather short instructions when
the operands are short. However, when 32-bit operands
are used, which is rare, the instructions are not very
efficient.

Prefix instructions also extend the instruction set up
to 2^32 instructions. The T414 implements fewer than 512
instructions by using one Prefix instruction and a
special Operate (operate) instruction coded with the
hexadecimal number F. This coding means that the
operand register now contains the instruction code.
These instructions have no explicit operand.

Hence, transputers have two basic Prefix instructions,
13 basic instructions that use the operand register
as an immediate value or displacement, and one Operate
instruction. This arrangement allows the building of
16 short instructions from the 4-bit associated value. It
also allows 512 instructions that have the format
2n1 Fn0 or 6n1 Fn0, where 2 and 6 are the codes of the
positive and negative Prefix instructions, and F is the
Operate instruction.

CALM and Inmos notations 

These notations complement each other. CALM
describes exactly what each instruction does at the register
transfer level, while the Inmos notation emphasizes
the use of the instruction when it is handling
the usual data structures. However, CALM has been
designed for microprocessors with several registers and
memory spaces and a rather orthogonal set of notations.
It is not very well suited for stack machines, since
CALM expresses everything the instruction does (excluding
action on flags). The evaluation stack of the
T414 is a region in which several transfers can occur
simultaneously. The transfer rules (that is, which registers
are updated or modified) are sometimes not intuitive.
CALM mentions only the major operands; secondary
registers that are modified are shown between
brackets, as are the flags.

The Inmos notations do not explicitly specify the
transputer registers. The evaluation stack and the
workspace pointer are never mentioned. These instructions
are hence very short and easy to enter, but not
very readable to a programmer unfamiliar with the
processor.

Inmos sometimes uses condensed notations that correspond
to several instructions. For instance, the Inmos
notation for the Move Block instruction, assuming the
three operands are at offsets SOURCE, DEST, and
LENGTH from the top of the workspace, is move. The
Occam notation for this is v1 := v2, which means that
destination vector v1 becomes vector v2. This notation
corresponds to

MOVE.32 {W} + SOURCE,A-B-C (ldl source, load local)
MOVE.32 {W} + DEST,A-B-C (ldl dest)
MOVE.32 {W} + LENGTH,A-B-C (ldl length)
REP {A} MOVE.8 {C+},{B+} (move)

where REP {A} MOVE.8 {C+},{B+} means repeat the
Move Byte instruction the number of times indicated by
the contents of register A.

Homewood et al. [1] use the Move instruction in their
discussion of Move2d, which moves a block of data
corresponding to a rectangle on a screen. In addition to
the Move instruction, the compiler generates tens of instructions
for preparing the workspace and giving
control back to the calling module.


Instruction set 

We present the T414 instructions in the order used by
most microprocessor manufacturers. Inmos has a
slightly different order, since it tends to group the instructions
that use the evaluation stack, the workspace,
and the channels.

The remaining figures in this article present (from
left column to right)

• the instruction code with shaded codes or prefixes,

• the Inmos notation for the same code,

• the Inmos mnemonic instruction and its explanation,

• the CALM mnemonic and operand expressions
that clearly show the addressing modes, and

• a list of the registers that are modified.

[A] means that register A is modified according to the
logic of the instruction. [A := B] means that register A
is loaded with the contents of B. When a register is left
undefined by an instruction, the symbol Ø appears (for
example, [AØ] stands for A undefined).

Jump instructions. These instructions cause a deviation
from the normal sequential flow (see Figure 8).
Surprisingly, registers A, B, and C are
modified by a Jump or Call instruction. The Jump
Relative and Conditional Jump instructions (j and cj)
are relative to the instruction pointer. The Conditional
Jump instruction is only performed if the contents of
register A = 0. Note that there is no flag register; the
system checks the condition on the top of the stack
instead.

The Loop End instruction (lend) controls replications.
A loop is controlled by two words in memory
pointed to by register B (addresses {B} and {B} + 4). The
first word is the control variable, which is incremented
each time the loop is repeated; the second word is the
number of iterations that remain. Register A holds the
number of bytes between the start of the next instruction
and the beginning of the loop. The Loop End instruction
checks the number of iterations, {B} + 4. When the number is
greater than 1, the instruction jumps back to the start of
the loop, increments the control variable, and decrements
the iterations. When the number is not greater
than 1, the number of iterations is decremented, and
execution is passed to the next instruction.
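
The Loop End rule can be traced with a small simulation. The sketch below is illustrative only: memory is modeled as a Python dictionary keyed by byte address, the jump itself is replaced by a returned flag, and the names `loop_end` and `run_loop` are hypothetical.

```python
def loop_end(memory: dict, b: int) -> bool:
    """Apply the lend rule to the two control words at b and b + 4.

    Returns True when execution should jump back to the start of the loop.
    """
    remaining = memory[b + 4]
    memory[b + 4] = remaining - 1      # the count is decremented in both cases
    if remaining > 1:
        memory[b] += 1                 # advance the control variable
        return True
    return False


def run_loop(start: int, count: int) -> list:
    """Collect the control-variable values seen by the loop body."""
    base = 0x100                       # arbitrary address of the control block
    memory = {base: start, base + 4: count}
    seen = []
    while True:
        seen.append(memory[base])      # loop body
        if not loop_end(memory, base):
            return seen


values = run_loop(0, 3)                # body runs with 0, 1, 2
```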

The Call and Return instructions (call and ret) implement
procedures. The Call instruction pushes the instruction
pointer and the evaluation stack onto the
workspace (as in Figure 3). The Return instruction
restores the instruction pointer and adjusts the
workspace pointer to deallocate the four locations.
Registers A, B, and C are neither restored nor modified.
The General Call instruction (gcall) interchanges
register A and instruction pointer I. This interchange
enables the entry point of a procedure to be computed
by one or more instructions in register A before the
General Call instruction is executed.

Figure 7. Jump instruction (JUMP ADDR) with different operand lengths. The shortest form is an 8-bit instruction (jump to {I}+<0..F>); prefixes extend it to {I}+<10..FF>, {I}-<1..10>, {I}+<100..FFF>, and {I}-<11..100>, up to a 64-bit instruction (jump to {I}+<10000000..FFFFFFFF>).

Code    Inmos   Mnemonic  CALM                       Modified
0n              j         JUMP      ADDR             [AØ,BØ,CØ,EØ]
    jump relative (offset in bytes, ADDR = {I} + ...n)
An              cj        JUMP,AEQ  ADDR             [none or A:=B,B:=C,CØ]
    conditional jump (if A≠0, A:=B, B:=C; otherwise jump with A,B,C unchanged)
22 F1   F/21    lend      DJ.PL     {B}+4,{I}-{A}    [AØ,BØ,CØ,EØ]
    loop end (if {B}+4 > 1, jump to {I}-{A}, increment {B} and decrement {B}+4;
    otherwise decrement {B}+4 and continue)
9n              call      CALL      ADDR             [W,A:=I,BØ,CØ]
    call (SUB.32 #4*4,W; MOVE.32 I,{W}; MOVE.32 A,{W}+4; MOVE.32 B,{W}+8;
    MOVE.32 C,{W}+12; see fig 2)
22 F0   F/20    ret       RET                        [W]
    return (MOVE.32 {W},I; ADD.32 #4*4,W)
F6      F/6     gcall     EX        A,I              [A]
    general call

Figure 8. Jump instructions.

Code    Inmos   Mnemonic  CALM                          Modified
4n              ldc       MOVE.32  #VAL,A-B-C           [A,B:=A,C:=B]
    load constant (VAL = ...n)
24 F2   F/42    mint      MOVE.32  #MinInt,A-B-C        [A,B:=A,C:=B]
    minimum integer (MinInt = -2**31)
7n              ldl       MOVE.32  {W}+DISP,A-B-C       [A,B:=A,C:=B]
    load local (DISP = 4*...n)
Dn              stl       MOVE.32  A,{W}+DISP           [A:=B,B:=C,CØ]
    store local (DISP = 4*...n)
1n              ldlp      MOVE.32  #{W}+DISP,A-B-C      [A,B:=A,C:=B]
    load local pointer (DISP = 4*...n)
3n              ldnl      MOVE.32  {A}+DISP,A           [A]
    load non local (DISP = 4*...n)
En              stnl      MOVE.32  B,{A}+DISP           [A:=C,BØ,CØ]
    store non local (DISP = 4*...n)
5n              ldnlp     MOVE.32  #{A}+DISP,A          [A]
    load non local pointer (equivalent to ADD.32 #DISP,A)
F1      F/1     lb        MOVE.8   {A},A                [A]
    load byte (A is extended with zeros, DISP = ...n)
23 FB   F/3B    sb        MOVE.8   B,{A}                [A:=C,BØ,CØ]
    store byte
FF      F/F     outword   MOVE.32  A,4*{B}              [AØ,BØ,CØ,EØ,{W}Ø]
    output word (B points to a channel)
FE      F/E     outbyte   MOVE.8   A,{B}                [AØ,BØ,CØ,EØ,{W}Ø]
    output byte (B points to a channel)
21 FB   F/1B    ldpi      MOVE.32  #{I}+{A},A           [A]
    load pointer to instruction (ADD.32 I,A) (offset in A measured in bytes)
F2      F/2     bsub      MOVE.32  #{A}+{B},A           [A,B:=C,CØ]
    byte subscript (ADD.32 B,A)
FA      F/A     wsub      MOVE.32  #{A}+4*{B},A         [A,B:=C,CØ]
    word subscript (ADD.32 B*4,A)

Figure 9. Transfer instructions.

Transfer instructions. These instructions enable variables
and pointers to be fetched or stored (Figure 9).
There are two basic types: local (with respect to the
workspace pointer) and nonlocal (with respect to register
A). Many of these instructions are similar to those
found on standard microprocessors. For instance, the
Load Constant instruction (ldc) pushes a constant onto
register A. The Minimum Integer instruction (mint)
pushes the most negative address MinInt onto register
A. The value of MinInt is equal to H’80000000 on the
T414.

The Load Local, Load Local Pointer, and Store
Local instructions (ldl, ldlp, and stl) access locations
relative to register W. A local variable can be placed on
the evaluation stack by the Load Local instruction. Its
address is placed on the stack by the Load Local
Pointer instruction. The Store Local instruction stores
the value at the top of the stack back into the variable.
The instructions Load Nonlocal, Store Nonlocal, and
Load Nonlocal Pointer (ldnl, stnl, and ldnlp) are
similar to the local instructions except they are relative
to register A, not register W, which allows a level of indirection.
In the case of the Store Nonlocal instruction,
the value in register B is stored at the address found in
register A. Instructions Load Byte and Store Byte (lb
and sb) are used for byte transfers. In multiple transfers,
Output Word and Output Byte instructions (outword
and outbyte) communicate a single word and a
single byte in register A through the channel pointed to
by register B (referred to as the B channel), using the
first word of the workspace. The Load Pointer to Instruction
instruction (ldpi) pushes the address of a location
in a program onto the stack.

Code    Inmos   Mnemonic  CALM                          Modified
24 FA   F/4A    move      REP {A} MOVE.8 {C+},{B+}      [AØ,BØ,CØ]
    move message (Occam v1:=v2, block is {A} bytes long)
F7      F/7     in        REP {A} MOVE.8 {B},{C+}       [AØ,BØ,CØ,EØ]
    input message (Occam: chan ? var)
FB      F/B     out       REP {A} MOVE.8 {C+},{B}       [AØ,BØ,CØ,EØ]
    output message (Occam: chan ! exp)

Figure 10. Move Block instructions.

Code    Inmos   Mnemonic  CALM                          Modified
F0      F/0     rev       EX.32   A,B                   [A,B]
    reverse
23 FC   F/3C    gajw      EX.32   A,W                   [W,A]
    general adjust workspace
23 FA   F/3A    xword     CONV    B,A.A32               [A,B:=C,CØ]
    extend to word (if B < A, A := B, otherwise A := (B - 2*A))
21 FD   F/1D    xdble     CONV    A.A32,BA.A64          [B := -1 if A<0, C...
                                                         [B := 0 if A>0, C...
    extend to double (extend A into A and B)
24 FC   F/4C    csngl     CONV    AB.A64,A.A32          [B:=C,CØ,E]
    convert single (if (A<0 and B≠-1) OR (A>0 and B≠0) E := 1,
    otherwise unchanged)
23 F4   F/34    bcnt      SL.32   #2,A                  [A]
    byte count (MUL.32 #4,A)
23 FF   F/3F    wcnt      ASR.32  #2,A                  [A,B,C:=B]
    word count (B gets the two low bits of A) (DIV.32 #4,A)

Note: .32 means 32-bit unsigned content; .A32 means 32-bit signed
(2’s complement) content.

Figure 11. Exchange and Format Conversion instructions.

The Byte Subscript instruction (bsub) returns the
sum of B and A register contents into register A without
any overflow check. The Word Subscript instruction
(wsub) does the same, but first multiplies the
contents of register B by four (on the T414). These two
instructions interpret register A as the address of the
beginning of a data structure. They leave in register A
the address of the element that lies B bytes (bsub) or
B words (wsub) from the beginning of the structure.
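
The address arithmetic of the Byte Subscript and Word Subscript instructions can be written out as follows. This is a sketch under the assumption of a 32-bit machine with 4-byte words (the T414); the wrap-around mask models the absence of an overflow check, and the function names simply reuse the Inmos mnemonics.

```python
WORD_BYTES = 4            # bytes per word on the T414
MASK = 0xFFFFFFFF         # results wrap to 32 bits, with no overflow check


def bsub(a: int, b: int) -> int:
    """Address of the byte that lies b bytes after base address a."""
    return (a + b) & MASK


def wsub(a: int, b: int) -> int:
    """Address of the word that lies b words after base address a."""
    return (a + WORD_BYTES * b) & MASK


third_word = wsub(0x1000, 3)   # 0x100C
third_byte = bsub(0x1000, 3)   # 0x1003
```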

Move Block instructions. These instructions allow a
block of elements to be moved from one area to another
(Figure 10). The Move instruction described earlier
moves the number of data bytes found in the A register
from the address pointed to by register C to the
address pointed to by register B (the
blocks must not overlap). The In and Out instructions
(in and out) communicate data. The In instruction
transfers a block of A bytes from the B-channel address to
the address pointed to by register C. The Out instruction
transfers A bytes from the address pointed to by
register C to the B-channel address. The processes at
either end of a communication should have the same
value in register A.

Exchange and format conversion instructions. These 
instructions change the format of data and exchanging 
registers (Figure 11). The Reverse instruction exchanges 
the contents of registers A and B. The General Adjust 
Workspace instruction (gajw) exchanges the W and A 
registers, allowing dynamic allocation of workspaces as 


June 1989 69 







































T414 instruction set 



Mnem.  Op     CALM notation      Registers affected
       Description

adc    8      ADD.32 #VAL,A      [A,E]
       add constant (E set if overflow, otherwise unchanged) (VAL = ...)

add    F/5    ADD.32 B,A         [A,B:=C,C0,E]
       add (E set if overflow, otherwise unchanged)

sum    F/52   ADDn.32 B,A        [A,B:=C,C0]
       sum (E not influenced) -- use bsub --

ladd   F/16   ADDC.32 B,A        [A,B0,C0,E]
       long add A := A+B+C (carry is low bit of C, E set if overflow,
       unchanged otherwise)

lsum   F/37   ADDCn.32 B,A       [A,B,C0]
       long sum A := A+B+C (carry is low bit of C, B gets the resulting carry)

sub    F/C    ISUB.32 B,A        [A,B:=C,C0,E]
       subtract (A := B-A, E set if overflow, otherwise unchanged)

diff   F/4    ISUBn.32 B,A       [A,B:=C,C0]
       difference (A := B-A)

lsub   F/38   ISUBC.32 B,A       [A,B0,C0,E]
       long subtract A := B-A-C (borrow is low bit of C, E set if overflow,
       otherwise unchanged)

ldiff  F/4F   ISUBCn.32 B,A      [A,B,C0]
       long subtract A := B-A-C (borrow is low bit of C, B gets resulting borrow)

mul    F/53   MUL.32 B,A         [A,B:=C,C0,E]
       multiply (E set if overflow, otherwise unchanged)

prod   F/8    MULn.32 B,A        [A,B:=C,C0]
       product (faster if small operand in A)

lmul   F/31   MULn.32 B,BA       [A,B,C0]
       long multiply (BA := A*B+C, B gets resulting carry)

div    F/2C   DIV.32 B,A         [A,B:=C,C0,E=1]
       divide (quotient; if overflow or A=0, E := 1 and A unchanged,
       otherwise E unchanged)

rem    F/1F   REM.32 B,A         [A,B:=C,C0,E=1]
       remainder (rest; if overflow or A=0, E := 1 and A unchanged,
       otherwise E unchanged)

ldiv   F/1A   DIV.64 CB,A        [A,B,C0,E]
       long divide (divide CB by A, B gets remainder, E set if invalid,
       otherwise unchanged)

ajw    B      ADD.32 #DISP,W     [W]
       adjust workspace (DISP = 4*n)


Figure 12. Arithmetic instructions. 


well as dynamic switching between existing workspaces. 
The Extend to Word instruction (xword) sign-extends a 
part-word value to a single-word value. The two operands 
of the instruction are a part-word in register B and a 
length in register A, given as the bit pattern of the most 
negative integer representable by the part-word (that is, 
its sign bit). For instance, to sign-extend a byte in 
register A to 32 bits, one first loads the constant H'80 
onto the evaluation stack (ldc H'80, code 2840), which 
moves the byte into register B, and then executes xword 
(code 23FA). The Extend to Double instruction (xdble) 
extends the single-length signed value in register A into a 
double-length signed value in registers A and B (most 
significant part in B). Conversely, the Convert Single 
instruction (csngl) does the reverse and sets the error 
flag if an overflow condition exists. 

The Byte Count instruction (bcnt) multiplies register 
A by the number of bytes in a word (for example, by 4 on 
a 32-bit machine). 

A pointer has two parts: a word address and a byte 
selector. The number of bits required to represent the 
byte selector depends on the word length (for example, 
1 bit for a 16-bit word length, 2 bits for a 32-bit word 
length). The Word Count instruction (wcnt) decomposes 
an address into its component word part and byte 
selector. A word offset goes into register A, a byte 
selector into B. 
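
The conversion instructions above can be modelled in a few lines of Python. This is only a sketch of the behaviour described in the text, assuming a 32-bit word; the function names mirror the transputer mnemonics, but the argument order, the return conventions, and the masking of stray high bits in xword are illustrative assumptions, not Inmos definitions.

```python
WORD_BITS = 32
MASK = (1 << WORD_BITS) - 1          # 0xFFFFFFFF
BYTES_PER_WORD = WORD_BITS // 8      # 4 on a 32-bit transputer


def xword(b, a):
    """Sign-extend the part-word in b; a is the part-word's sign-bit value."""
    b &= (a << 1) - 1                # assumption: ignore bits above the part-word
    return (b - (a << 1)) & MASK if b & a else b


def xdble(a):
    """Return (low word, high word) of the sign-extended double-length value."""
    high = MASK if a & (1 << (WORD_BITS - 1)) else 0
    return a & MASK, high


def csngl(low, high):
    """Return (single word, error); error is True if the value does not fit."""
    expected_high = MASK if low & (1 << (WORD_BITS - 1)) else 0
    return low & MASK, high != expected_high


def bcnt(a):
    """Convert a word count into a byte count."""
    return (a * BYTES_PER_WORD) & MASK


def wcnt(a):
    """Split a non-negative address into (word offset, byte selector)."""
    return a >> 2, a & (BYTES_PER_WORD - 1)
```

For example, xword(0xFF, 0x80) models the ldc H'80 / xword sequence from the text and yields the 32-bit pattern for -1.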

Arithmetic instructions. The Add Constant instruction 
(adc) allows a constant to be added to register A and 
checks for overflow conditions (Figure 12). The other 
arithmetic instructions fall into two basic sets: single- 
and multiple-length (longer than a word) instructions. 
Multiple-length instructions are identified by the letter 
l (for long) at the beginning of each mnemonic. These 
two sets can be further divided into two more sets: those 
that check for error (overflow), namely add, sub, mul, 
div, rem, ladd, lsub, lmul, and ldiv, and those that ignore 
carry and overflow, namely sum, diff, prod, lsum, and 
ldiff. For single-length operations, the left-hand operand 
is taken into register B and the right-hand operand into 
A, with the result placed in A. In multiple-length 
operations, two single-word operands are in registers A 
and B and a carry (or borrow) operand is in the LSB of C. 
The CALM mnemonic ISUB means that the operands are 
inverted with respect to other microprocessors like the 


MC68020, NS32032, or VAX. ISUB B,A computes A = 
B-A, while the usual SUB B,A computes A = A-B. 

The Adjust Workspace instruction (ajw) adjusts the 
value of the workspace pointer by the number of words 
in its operand value. A negative number allocates more 
space. 
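
The difference between the checked and unchecked instructions, and the role of the carry in the long instructions, can be sketched as follows. This is a simplified Python model assuming a 32-bit word; the helper names (to_signed, sum_, add64) and the (result, error) return convention are illustrative assumptions, and the checked operations here return the wrapped result together with the overflow indication rather than modelling the E flag itself.

```python
WORD_BITS = 32
MASK = (1 << WORD_BITS) - 1
MIN_INT = -(1 << (WORD_BITS - 1))
MAX_INT = (1 << (WORD_BITS - 1)) - 1


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK
    return x - (1 << WORD_BITS) if x > MAX_INT else x


def add(b, a):
    """Checked add: returns (A result, overflow)."""
    total = to_signed(b) + to_signed(a)
    return total & MASK, not MIN_INT <= total <= MAX_INT


def sum_(b, a):
    """Unchecked (modulo) add: carry and overflow are ignored."""
    return (b + a) & MASK


def sub(b, a):
    """Checked subtract; note the operand order: A := B - A."""
    total = to_signed(b) - to_signed(a)
    return total & MASK, not MIN_INT <= total <= MAX_INT


def lsum(b, a, c):
    """Long sum: returns (A result, carry left in B); carry-in is the LSB of c."""
    total = (a & MASK) + (b & MASK) + (c & 1)
    return total & MASK, total >> WORD_BITS


def add64(x, y):
    """Add two unsigned 64-bit values one word at a time, chaining the carry."""
    low, carry = lsum(x & MASK, y & MASK, 0)
    high, _ = lsum(x >> WORD_BITS, y >> WORD_BITS, carry)
    return (high << WORD_BITS) | low
```

With B = 5 and A = 3, sub(5, 3) gives 2, matching the A := B-A convention of the CALM ISUB mnemonic.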

Compare instructions. These instructions compare 
and check values on the evaluation stack (Figure 13). 
Since there is no flag register, the results of these 
operations are put on the evaluation stack. The value true 
is represented by 1, the value false by 0. The Equals 
Constant instruction (eqc) sets register A to true if A is 
equal to the instruction operand. Otherwise, it sets it to 
false. The Greater Than instruction (gt) sets A to true if 
B > A. If not, it sets it to false. The Check Word 
instruction (cword) checks whether a single word can be 
represented by a part-word. As with xword, the single-length 
word is in register B and the part-word length in A. The 
error flag is set if the value does not fit. For instance, 
to check whether the contents of register A is a 12-bit 
value or not, one pushes H'800 onto A (ldc H'800, code 
282040) and executes cword (code 25F6). These two 
instructions do not modify numbers initially in registers 
A and B. 

The Check Subscript instruction (csub0) sets the E 
flag if the unsigned value in register B is greater than or 
equal to the value in register A. The Check Count 
instruction (ccnt1) sets the E flag if B = 0 or if B is 
greater than A (that is, unless B lies between 1 and the 
contents of register A). This instruction can check that 
the count of an input or output instruction is positive 
and does not exceed the number of bytes in the message 
buffer. 
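
The compare and check conditions can be written out as small predicates. The following Python sketch assumes a 32-bit word and follows the conditions given in the text and in Figure 13; the functions return the value left in A (for eqc and gt) or a boolean saying whether the E flag would be set (for the check instructions), which is an illustrative convention rather than the hardware interface.

```python
WORD_BITS = 32
MASK = (1 << WORD_BITS) - 1


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK
    return x - (1 << WORD_BITS) if x & (1 << (WORD_BITS - 1)) else x


def eqc(a, constant):
    """Equals constant: 1 (true) if A equals the operand, else 0 (false)."""
    return 1 if (a & MASK) == (constant & MASK) else 0


def gt(b, a):
    """Greater than (signed comparison): 1 if B > A, else 0."""
    return 1 if to_signed(b) > to_signed(a) else 0


def cword(b, a):
    """Check word: error if B does not fit the part-word whose sign bit is A."""
    value = to_signed(b)
    return value >= a or value < -a


def csub0(b, a):
    """Check subscript from 0: error if B >= A (unsigned)."""
    return (b & MASK) >= (a & MASK)


def ccnt1(b, a):
    """Check count from 1: error if B = 0 or B > A (unsigned)."""
    return (b & MASK) == 0 or (b & MASK) > (a & MASK)
```

For the 12-bit example in the text, cword(value, 0x800) reports an error for any value outside the range -2048 to 2047.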

Logical instructions. All logical operations (Figure 
14) are performed on the evaluation stack. The 
instructions Logical And, Logical Or, and Exclusive Or 
(and, or, and xor) are bit-wise operations on registers A 
and B, with the result left in A. The Bit-wise Not 
instruction (not) has only one operand, which is in 
register A. The Shift Left and Shift Right instructions 
(shl and shr) displace the operand in register B by the 
number of bits 


Mnem.  Op     CALM notation      Registers affected
       Description

eqc    C      COMP.32 #VAL,A     [A]
       equals constant (if EQ, A := 1, otherwise A := 0)

gt     F/9    COMP.A32 B,A       [A,B:=C,C0]
       greater than (if B>A, A := 1, otherwise A := 0)

cword  F/56   CHECK,VS B,A       [A:=B,B:=C,C0,E]
       check word in B according to format in A (E set if (B >= A) OR
       (B < (-A)), otherwise unchanged)

csub0  F/13   CHECK,LS B,A       [A:=B,B:=C,C0,E]
       check subscript from 0 (E set if A <= B (unsigned integers),
       unchanged otherwise)

ccnt1  F/4D   CHECK,EQ B,A       [A:=B,B:=C,C0,E]
       check count from 1 (E set if B = 0 OR A < B (unsigned integers),
       unchanged otherwise)


Figure 13. Compare instructions. 


Mnem.  Op     CALM notation      Registers affected
       Description

and    F/46   AND.32 B,A         [A,B:=C,C0]
       logical and (Occam /\)

or     F/4B   OR.32 B,A          [A,B:=C,C0]
       logical or (Occam \/)

xor    F/33   XOR.32 B,A         [A,B:=C,C0]
       exclusive or (Occam ><)

not    F/32   NOT.32 A           [A]
       bitwise not (Occam ~)

shl    F/41   SL.32 A,B          [A,B:=C,C0]
       shift left (shift by A locations, if 0 <= A < 32,
       effect undefined otherwise)

shr    F/40   SR.32 A,B          [A,B:=C,C0]
       shift right (shift by A locations, if 0 <= A < 32,
       effect undefined otherwise)

lshl   F/36   SL.64 A,CB         [A,B,C0]
       long shift left (CB shifted by A, result in BA)

lshr   F/35   SR.64 A,CB         [A,B,C0]
       long shift right (CB shifted by A, result in BA)

norm   F/19   NORM BA            [A,B,C]
       normalise (shift left until MSB is one, C gets shift count)


Figure 14. Logical instructions. 




Mnem.        Op     CALM notation     Registers affected
             Description

testerr      F/29   TCLR E            [A,B:=A,C:=B,E:=0]
             test error false and clear (if E=1, A := 0; if E=0, A := 1)

stoperr      F/55   STOPERR           [A0,B0,C0,E:=1]
             stop on error (if E=1, MOVE I,{W}-4, process stops, next
             process selected; if E=0, continue)

seterr       F/10   SET E             [E:=1]
             set error (sets error flag)

clrhalterr   F/57   CLR H             [H:=0]
             clear halt-on-error (clears H flag)

sethalterr   F/58   SET H             [H:=1]
             set halt-on-error (sets H flag)

testhalterr  F/59   TEST H            [A,B:=A,C:=B]
             test halt-on-error (A := 1 if H=1, A := 0 if H=0)

testpranal   F/2A   TEST FlagAnal     [A,B=A,C=B]
             test processor analysing (A = 1 if analysed, A = 0 if reset)


Figure 15. Error and Halt flag instructions. 




Mnem.    Op     CALM notation                  Registers affected
         Description

runp     F/39   MOVE.32 A,{BPtrRegi-2}         [A0,B0,C0]
                MOVE.32 A,BPtrRegi
         run process (the process with workspace at A is added to
         appropriate queue)

startp   F/D    MOVE.32 {I}+{B},{A}-1          [A0,B0,C0]
                MOVE.32 A,{BPtrRegi-2}
                MOVE.32 A,BPtrRegi
         start process (process B bytes from I with workspace at A is
         placed on current priority queue)

endp     F/3    COMP.32 #1,{A}+1               [A0,B0,C0]
                JUMP.NE NEXT$
                JUMP (A)
         NEXT$: DEC.32 {A}+1
                JUMP (FPtrRegi)
         end process (if count = 1 process terminates, otherwise new
         process selected)

stopp    F/15   MOVE.32 I,{W}-1                [A0,B0,C0]
                JUMP (FPtrRegi)
         stop process (process stops execution, new process selected
         for execution)

ldpri    F/1E   MOVE.8 current priority,A      [A,B:=A,C:=B]
         load priority (0 for high priority, 1 for low priority)

resetch  F/12   MOVE.32 (A),A                  [A]
                MOVE.32 #MinInt,(A)
         reset channel (if A points to the link channel, the link
         hardware is reset)


Figure 16. Process-manipulation instructions. 


specified in A. Vacated bits are set to zero. 
Long Shift Left and Long Shift Right (lshl and lshr) 
shift by A bits the double-length value held in registers 
B and C (the most significant word, or MSW, in C). The 
result remains in registers A and B (the MSW in B). The 
Normalize instruction (norm) supports floating-point 
arithmetic by normalizing a double-length value in 
registers A and B (the MSW in B). The value in registers 
BA shifts to the left until the most significant bit (MSB) 
of the value is 1. The number of bits shifted remains in 
register C. Because floating-point routines written for 
the T414 are tricky, it is better to use a T800, with its 
on-chip floating-point unit, for such work. 
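
The double-length shifts and the normalize operation are easy to misread in register notation, so here is a small Python sketch of their effect on a 32-bit machine. Argument order follows the registers named in the text (MSW first); the return tuples and the result chosen for an all-zero operand in norm (a shift count of 64) are assumptions made for this illustration.

```python
WORD_BITS = 32
MASK = (1 << WORD_BITS) - 1
DOUBLE_MASK = (1 << (2 * WORD_BITS)) - 1


def shl(b, a):
    """Shift B left by A bits (0 <= A < 32); vacated bits are zero."""
    return (b << a) & MASK


def shr(b, a):
    """Shift B right by A bits (0 <= A < 32); vacated bits are zero."""
    return (b & MASK) >> a


def lshl(c, b, a):
    """Shift the double word C:B left by A bits; returns (MSW for B, LSW for A)."""
    value = ((((c & MASK) << WORD_BITS) | (b & MASK)) << a) & DOUBLE_MASK
    return value >> WORD_BITS, value & MASK


def lshr(c, b, a):
    """Shift the double word C:B right by A bits; returns (MSW for B, LSW for A)."""
    value = (((c & MASK) << WORD_BITS) | (b & MASK)) >> a
    return value >> WORD_BITS, value & MASK


def norm(b, a):
    """Normalize the double word B:A; returns (MSW, LSW, shift count for C)."""
    value = ((b & MASK) << WORD_BITS) | (a & MASK)
    if value == 0:
        return 0, 0, 2 * WORD_BITS   # assumption: nothing to normalize
    count = 0
    top_bit = 1 << (2 * WORD_BITS - 1)
    while not value & top_bit:
        value <<= 1
        count += 1
    return value >> WORD_BITS, value & MASK, count
```

The shift count returned by norm is what a software floating-point routine would subtract from the exponent after an operation that left leading zeros in the mantissa.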

Error and Halt flag control. The error flag E must be 
copied into register A to be checked by software. The 
Test Error instruction (testerr) puts a false value (value 
0) onto the evaluation stack (that is, register A) if E is 
set. Otherwise, it places a true value on the stack. In 
both cases it clears the E flag (Figure 15). The Stop on 
Error instruction (stoperr) removes the current process 
from the schedule if the E flag is set. Three other 
instructions (seterr, clrhalterr, and sethalterr) 
respectively set the error flag, clear the halt-on-error 
flag, and set the halt-on-error flag. The Test 
Halt-on-Error instruction (testhalterr) pushes the value of 
the halt-on-error flag onto the evaluation stack (without 
any inversion). The Test Processor Analyzing instruction 
(testpranal) tests whether a processor has been 
analyzed or reset. It puts a true value into register A if 
the processor is analyzed, a false value if it is reset. 


Process manipulation. As mentioned, a transputer 
can run more than one process by using the on-chip 
kernel. This set of instructions (Figure 16) adds or 
removes processes from the schedule when operating in 
this mode. The Load Priority instruction (ldpri) loads 
the priority of the current process onto the evaluation 
stack. The Start Process and End Process instructions 
(startp and endp) perform process initiation and 
termination. The Start Process instruction initiates a new 
process (that is, initializes a new workspace) at the 
current priority level. Register B contains the offset from 
the current instruction to the new process. Register A 
has the address of the workspace of the new process. 
The End Process instruction synchronizes the termination 
of the current process. The process that started the 
other concurrent processes keeps a count in the address 
[A] + 1 of the number of processes still to terminate. 
Register A contains the workspace address of the 
initiating process. 

The End Process instruction also checks the number 
of concurrent processes still to terminate. If just one 
exists, control passes back to the initiating process 
(workspace pointed to by register A). If more than one 
process remains, the number of concurrent processes is 
decremented at the address [A] + 1, and the process at 
the front of the schedule list is taken. The Run Process 
instruction (runp) starts a process. Register A contains 
the process descriptor of an existing process, which 
should point to a workspace in which location W-1 (see 
Figure 5) contains the value to be loaded into instruction 
pointer I when the process is scheduled. The priority 
level is determined by the LSB of register A and can be 
set before the Run Process instruction is executed. The 
Stop Process instruction (stopp) halts the current 
process, saving instruction pointer I in the workspace. 
The Reset Channel instruction (resetch) allows a channel 
to be reset. Register A contains the address of the 
channel. The channel process word is returned to 
register A and reset to the value MinInt. If the value 
returned to register A is equal to MinInt, the 
communication finished successfully. If not, register A 
contains the process identification number of the 
channel. This identification can be used by a Run Process 
instruction to restart the process. If register A was 
pointing to a link, that hardware is also reset. 
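
The counting scheme used by End Process can be sketched in Python. This is a deliberately simplified model, not the microcode: memory is a dictionary indexed by word address, the run queue is a deque of workspace addresses, and the returned tuples are an invented convention that says which process runs next.

```python
from collections import deque


def endp(memory, parent_workspace, run_queue):
    """Model End Process: resume the parent or schedule another process.

    memory[parent_workspace + 1] holds the number of processes still to
    terminate, as described in the text.
    """
    if memory[parent_workspace + 1] == 1:
        # Last process to finish: control passes back to the initiating process.
        return ("resume_parent", parent_workspace)
    # Otherwise decrement the count and take the process at the queue front.
    memory[parent_workspace + 1] -= 1
    next_process = run_queue.popleft() if run_queue else None
    return ("schedule", next_process)


# Usage: a parent at (hypothetical) workspace address 100 waits for 3 processes.
memory = {101: 3}
queue = deque([200, 300])
first = endp(memory, 100, queue)    # count 3 -> 2, process 200 scheduled
second = endp(memory, 100, queue)   # count 2 -> 1, process 300 scheduled
third = endp(memory, 100, queue)    # count is 1: parent resumes
```

The parent thus continues only after every component process has executed its own End Process instruction.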

Alternatives. Several instructions implement the 
Occam ALT construct on the transputer (Figure 17). An 
alternative construct selects one of its component 
alternatives. The selection of the alternative is 
performed by 

• an Alternative instruction (alt), 

• a sequence of Enable instructions (one for each 
alternative), 

• an Alternative Wait instruction (altwt), 

• a sequence of Disable instructions (one for each 
alternative), and 

• an Alternative End instruction (altend). 

The first ready alternative to be disabled is selected. The 
Enable instructions (enbs or enbc) are conditional 
instructions that test a Boolean value in register A, the 
process guard (the condition for execution). These 
instructions also set the process-ready flag in the 
workspace. The channel component for an enable channel 
is passed into register B. If register A is true, the guard 
is enabled. The Disable instructions (diss and disc) take 
in register A the offset from the start of the instruction 
following the Alternative End to the start of the code for 
that branch of the alternative. The tested condition is in 
register B, with the channel component in C. The 
instructions return a true value into register A if that 
branch of the alternative is the one that has been 
selected. We provide more details and examples on 
alternatives and parallel processes in an upcoming 
publication. 7 
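
The five-step sequence above can be illustrated for the simplest case, an ALT whose guards are all skip guards (enbs and diss). The Python sketch below is a single-process simplification: the state values MinInt+1 and MinInt+3 and the -1 stored by altwt are taken from Figure 17, while the function name, the list-based interface, and the use of None for "process would be descheduled" are assumptions made for illustration.

```python
MIN_INT = -(1 << 31)
ENABLING = MIN_INT + 1     # stored by alt
READY = MIN_INT + 3        # stored by an enable whose guard is true
NONE_SELECTED = -1         # stored by altwt before disabling starts


def select_skip_branch(guards, offsets):
    """Return the branch offset chosen by an all-skip ALT, or None.

    guards[i] is the Boolean guard of alternative i and offsets[i] is the
    offset of its code from the instruction following altend.
    """
    state = ENABLING                       # alt
    for guard in guards:                   # one enbs per alternative
        if guard:
            state = READY
    if state != READY:                     # altwt: nothing ready yet
        return None
    selected = NONE_SELECTED               # altwt
    for guard, offset in zip(guards, offsets):   # one diss per alternative
        if guard and selected == NONE_SELECTED:
            selected = offset              # first ready alternative wins
    return selected                        # altend adds this to I
```

Because the disables run in program order, the first alternative whose guard is true is the one selected, which matches the rule stated above.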


Mnem.   Op     CALM notation                        Registers affected
        Description

alt     F/43   MOVE.32 #MinInt+1,{W}-3*4
        alt start (store flag to show enabling is occurring)

altwt   F/44   MOVE.32 #-1,{W}
        alt wait (process is descheduled unless {W}-3*4 = MinInt+3)

altend  F/45   MOVE.32 {I}+{{W}},I
        alt end (set I to first instruction of branch selected)

        (Register effects listed for this group: [A0,B0,C0,E0] and
        [A0,B0,C0])

enbs    F/49   MOVE.32.AEQ #MinInt+3,{W}-3*4        [A0]
        enable skip (if A=1, MOVE.32 #MinInt+3,{W}-3*4)

diss    F/30   MOVE.32.BEQ A,{W}                    [A,B:=C,C0]
        disable skip (if B=1 AND the first ready process, execute the
        move and A := 1, A := 0 otherwise)

enbc    F/48   MOVE.32.AEQ #MinInt+3,{W}-3*4        [A0,B0,C0]
        enable channel (if A=1 AND another process using channel,
        execute move, A := 0 otherwise)

disc    F/2F   MOVE.32.BEQ A,{W}                    [A,B0,C0]
        disable channel (if B=1 AND the first ready process, execute
        the move and A := 1, A := 0 otherwise)


Figure 17. Alternative-construct instructions. 




Mnem.    Op     CALM notation                        Registers affected
         Description

ldtimer  F/22   MOVE.32 TIMER,A                      [A,B:=A,C:=B]
         load timer (A = value of current priority level clock)

tin      F/2B   COMP.32 TIMER,A                      [A0,B0,C0,E0,H0]
         timer input (continue execution only when ClockRegi > A)

talt     F/4E   MOVE.32 #MinInt+1,{W}-3*4
                MOVE.32 #MinInt+2,{W}-4*4
         timer alt start ({W}-3 = state, {W}-4 = tlink)

taltwt   F/51   MOVE.32 #-1,{W}                      [A0,B0,C0,E0,H0]
         timer alt wait (process descheduled unless state = MinInt+3,
         or tlink = MinInt+1)

enbt     F/47   MOVE.32.AEQ #MinInt+3,{W}-3*4        [A,B:=C,C0]
                MOVE.32.AEQ #MinInt+1,{W}-4*4
                MOVE.32.AEQ B,{W}-5*4
         enable timer (if A=1 AND time AFTER B then newtime = B,
         otherwise newtime = time)

dist     F/2E   MOVE.32.BEQ A,{W}                    [A,B0,C0]
         disable timer (if B=1 AND time later than guard AND first
         ready process, execute the move and A := 1, A := 0 otherwise)

sttimer  F/54   MOVE.32 A,ClockReg0                  [A:=B,B:=C,C0]
                MOVE.32 A,ClockReg1
                Start clocks
         store timer


Figure 18. Timer instructions. 




Mnem.  Op     CALM notation               Registers affected
       Description

sthf   F/18   MOVE.32 A,FPtrReg0          [A:=B,B:=C,C0]
       store high priority front pointer

stlf   F/1C   MOVE.32 A,FPtrReg1          [A:=B,B:=C,C0]
       store low priority front pointer

sthb   F/50   MOVE.32 A,BPtrReg0          [A:=B,B:=C,C0]
       store high priority back pointer

stlb   F/17   MOVE.32 A,BPtrReg1          [A:=B,B:=C,C0]
       store low priority back pointer

saveh  F/3E   MOVE.32 FPtrReg0,{A}        [A:=B,B:=C,C0]
              MOVE.32 BPtrReg0,{A}+4
       save high priority queue registers

savel  F/3D   MOVE.32 FPtrReg1,{A}        [A:=B,B:=C,C0]
              MOVE.32 BPtrReg1,{A}+4
       save low priority queue registers


Figure 19. List-pointer instructions. 


Timers. The transputer has two 32-bit, free-running 
clocks that give it a measure of time. Timer Alternate 
Start, Timer Alternate Wait, Enable Timer, and 
Disable Timer instructions (talt, taltwt, enbt, and dist) 
are used with alternative instructions when time is 
introduced into one or more of the guards (Figure 18). 
The uses and effects of these instructions are similar to 
those of the corresponding instructions described in the 
section on alternatives. The Load Timer instruction 
(ldtimer) reads the value of the current priority clock 
and places its value onto the evaluation stack. Time 
delays are possible by using the Timer Input instruction 
(tin). The A register is set to a certain time. If the time 
in A has passed (Inmos uses the notation ClockReg AFTER 
A), the instruction has no effect. If it is in the future 
(A AFTER ClockReg or A = ClockReg), the process is 
removed from the schedule. Modulo arithmetic operations 
must be used to prevent errors when the clock wraps 
around from the 

most positive to the most negative value. The Store 
Timer instruction (sttimer) stores the value from 
register A into both clock registers (ClockReg0 and 
ClockReg1) and then starts both clocks. 
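
The AFTER comparison is the part of timer handling that most often goes wrong, so a short sketch may help. The Python below models AFTER with modulo-2^32 subtraction, so that it stays correct when the clock wraps from the most positive to the most negative value; the function names and the boolean interface are illustrative assumptions.

```python
WORD_BITS = 32
MASK = (1 << WORD_BITS) - 1
HALF_RANGE = 1 << (WORD_BITS - 1)


def after(a, b):
    """True if time a is later than time b, using modulo arithmetic."""
    difference = (a - b) & MASK
    return 0 < difference < HALF_RANGE


def tin_would_wait(clock, a):
    """True if Timer Input would remove the process from the schedule.

    The process continues only when the time in A has passed
    (clock AFTER a); otherwise it waits.
    """
    return not after(clock, a)
```

A plain signed comparison would report that H'80000000 is earlier than H'7FFFFFFF, although it is in fact one tick later; after() gets this case right.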

Pointers. The registers for the construction of linked 
lists appeared in Figure 2 (FPtrReg0, FPtrReg1, 
BPtrReg0, and BPtrReg1). The following instructions 
manipulate these registers (Figure 19). A set of 
instructions loads these registers: sthf, stlf, sthb, and 
stlb. These instructions store (s) the value in register A 
into the appropriate register (h for high priority, l for 
low, f for front pointer, b for back). The Save High- and 
Low-Priority Queue Registers instructions (saveh and 
savel) save the front and back pointers in two consecutive 
locations, the first pointed to by register A. Again, h is 
high priority, and l is low priority. 


This article has attempted to illustrate the internal 
workings of the T414 transputer using standard 
microprocessor notations. We provide more 
details and program examples elsewhere to explain the 
last four groups of instructions, which are rather 
tricky. 7 

The transputer is not an ordinary microprocessor. It 
takes some concepts from RISCs and also uses a stack 
for evaluation of arithmetic and logical operations 
rather than registers. Most importantly, it has been 
designed to work in a concurrent environment. 
Consequently, the transputer has a number of complex and 
cumbersome instructions not normally found in other 
microprocessors. However, these instructions can be 
rather clearly expressed by a set of familiar instructions. 


References 

1. M. Homewood et al., “The IMS T800 Transputer,” 
IEEE Micro, Vol. 7, No. 5, Oct. 1987, pp. 10-26. 

2. IMS T414 Transputer Reference Manual, Inmos Ltd., 
Bristol, England, 1984. 

3. J.D. Nicoud and F. Wagner, Major Microprocessors: A 
Unified Approach Using CALM, North Holland Press, 
Amsterdam, 1987. 

4. J.D. Nicoud and P. Fah, CALM Standard and 
Explanations, 2nd ed., Laboratoire de Microinformatique, 
Ecole Polytechnique Federale de Lausanne, Lausanne, 
Switzerland, Dec. 1986. 

5. The Transputer Instruction Set—A Compiler Writer’s 
Guide, Inmos Ltd., Bristol, England, 1987. 

6. C. Plumb,“An Introduction to Transputer Assembly 
Language,” Electronic paper broadcast on UUCPnet/Bit- 
net from ccplumb@watmath.waterloo.edu, Jan. 1989. 

7. A.M. Tyrrell and J.D. Nicoud, “Scheduling and Parallel 
Operations on the Transputer,” to be published in Proc. 
Euromicro Conf, North Holland Press, 1989. 



Jean-Daniel Nicoud is a professor at the Ecole Polytechnique 
Federale de Lausanne in Switzerland. He has been active in 
microprocessor-related research for 16 years and has designed 
many microprocessor-based systems. His interests and those 
of his research group include microcomputer design, 
development tools, local networks, microcomputer peripherals, 
neural networks, and very large scale integration technology. 

Nicoud holds a degree in engineering physics and a PhD 
degree in electrical engineering, both from the Ecole 
Polytechnique Federale de Lausanne. He was an associate 
editor of IEEE Micro from its inception in 1981 until 1985. 
He is a member of the IEEE Computer Society. 



Andrew Martin Tyrrell is a senior lecturer in the Department 
of Electrical, Electronic, and Systems Engineering at 
Coventry Polytechnic, Coventry, England. While on leave in 
1988, he worked as a research fellow with J.D. Nicoud on 
multiprocessor systems at EPFL. His research interests are 
fault-tolerant software, distributed processor systems, 
benchmarking for multiprocessor systems, models for 
concurrent systems, and topologies for image-processing 
applications. 

Tyrrell received a degree in electronic engineering from 
Bolton Institute of Technology, Bolton, England. He 
performed graduate research at Aston University, Birmingham, 
England, in a program for the design of fault-tolerant, loosely 
coupled, distributed systems. He is a member of the IEE and 
the IEEE. 

Questions about this article can be directed to J.D. Nicoud 
at Ecole Polytechnique Federale de Lausanne, Laboratoire de 
Microinformatique (EPFL-LAMI), Av. de Cour, 37, CH-1007 
Lausanne, Switzerland. 

















SPECIAL FEATURE 


A Logical Design Tool for 
Relational Databases 


Here's a way 
to prevent 
anomalies 
from affecting 
the design of 
your relational 
database. 


M. Mehdi Owrang O. 

W. Gamini Gunaratna 

The American University 


New trends in software development point to the automation of 
various phases of the software life cycle, 1 particularly the design 
phase. A typical example is computer-aided software engineering 
technology, 2-4 which provides tools to assist in software development. In the 
past few years, interest has grown in the development of 
database-management systems (DBMSs) based on the relational model 5-7 (see the 
accompanying box for definitions and notations). 

Systems for managing large-scale databases under the relational model 
(for example, Oracle 8 ) have become commercially available, which makes 
the value of a design tool for the relational model obvious. This fact 
motivated us to implement the Logical Design Tool (LDT), which can also 
be used as an educational tool for the relational database. Using this tool 
during database design can detect certain problems that arise when the data 
is manipulated. An example is a cycle in a schema 6,7,9 that leads to 
ambiguous interpretation of queries. The major problem, however, is to come up 
with a particular normal-form design (such as a third normal form) through 
the process of decomposition. Decomposing a relation scheme into several 
relation schemes prevents the occurrence of anomalies. 5-7 Consider the 
following Suppliers relation scheme taken from Ullman. 7 

SUPPLIERS (SNAME, SADDRESS, ITEM, PRICE) 

We can see several problems with this scheme: 

•Redundancy. The address of the supplier repeats each time an item is 
supplied. 

•Potential inconsistency (update anomalies). Because of the 
redundancy, users can update the address for a supplier in one tuple and leave it 
unchanged in another. Thus, users may not have a unique address for each 
supplier as they feel intuitively that they should. 

•Insertion anomalies. Users cannot record an address for a supplier if 
that supplier does not currently provide at least one item. Null values can be 
used for the Item and Price components of a tuple for that supplier. 
However, when users enter an item for that supplier, they may not remember 
to delete the tuple with the null values. In addition, if Item and Sname form 
a key for the relation, it might be awkward or impossible to look up tuples 
with null values in the key. 

•Deletion anomalies. If users delete all the items supplied by one 
supplier, they unintentionally lose track of its address. 




0272-1732/89/0600-0076$01.00 © 1989 IEEE 










The Relational Database Model 


Here we present some brief definitions and notations 
of the relational database used in this article. 

Definitions 

Conceptually, a table can represent a relation. 
Each row represents a tuple, and each column 
represents an attribute. The set of attribute names for a 
relation is called the relation scheme. 5,7,10 In Table 
A, we see a relation whose attributes are S#, Sname, 
Status, and City. One of its tuples is (S1, Smith, 20, 
London). The relation scheme for this relation is 
SUPPLIERS (S#, Sname, Status, City). Any set of one or 
more attributes that uniquely identifies the tuples is 
called a candidate key (like S#), and the candidate key 
selected for the relation is called the primary key (or 
just key). A relational schema is composed of a set 
of relation schemes. 5,7,10 

Data dependencies 

Data dependencies are the semantic constraints 
imposed on data. Recognizing the different types of 
dependencies is part of the process of understanding 
what the data means. 5,7,10 

Functional dependency (FD). Given a relation 
R, attribute Y of R is functionally dependent on 
attribute X, denoted as X -> Y, if and only if each X 
value in R has associated with it precisely one Y 
value in R. For example, attributes Sname, Status, 
and City of relation SUPPLIERS in Table A are each 
functionally dependent on attribute S#, because, 
given a particular value for S#, precisely one 
corresponding value exists for each of the attributes 
Sname, Status, and City. In symbols: 

S# -> Sname, S# -> Status, S# -> City 
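
The definition can be checked mechanically against the tuples of Table A. The Python sketch below is illustrative only: the function name holds and the dictionary representation of tuples are assumptions, and a check on one relation instance can only refute a dependency, since an FD is a constraint on all legal instances.

```python
# Tuples of the Suppliers relation from Table A.
SUPPLIERS = [
    {"S#": "S1", "Sname": "Smith", "Status": 20, "City": "London"},
    {"S#": "S2", "Sname": "Jones", "Status": 10, "City": "Paris"},
    {"S#": "S3", "Sname": "Blake", "Status": 30, "City": "Paris"},
    {"S#": "S4", "Sname": "Clark", "Status": 20, "City": "London"},
    {"S#": "S5", "Sname": "Adams", "Status": 30, "City": "Athens"},
]


def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[attribute] for attribute in lhs)
        value = tuple(row[attribute] for attribute in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True
```

Here holds(SUPPLIERS, ["S#"], ["City"]) is true, whereas holds(SUPPLIERS, ["City"], ["Status"]) is false because the two Paris suppliers have different status values.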

Multivalued dependency (MVD). A multivalued 
dependency, denoted as X ->> Y, means that the 
X value determines a set of Y values. 

Join dependency (JD). Functional and multivalued 
dependencies together capture a significant 
amount of semantic information of the data. For 
example, the FD X -> Y implies that the database 
scheme {XY, XZ} carries the same information as 
the relation scheme XYZ. This information can be 
represented concisely by the join dependency JD 

Table A. The Suppliers relation. 

S#    Sname    Status    City 
S1    Smith    20        London 
S2    Jones    10        Paris 
S3    Blake    30        Paris 
S4    Clark    20        London 
S5    Adams    30        Athens 


⋈[XY, XZ]. The set of join dependencies associated 
with a scheme R determines exactly those database 
schemes that can represent R without losing 
information. 

Normal forms 

In the theory of database design, 5,7,10 attributes can 
be assembled into a set of relations in many different 
ways. Some arrangements are "better" than others. 
The ways in which attributes can be arranged are 
called normal forms (for example, third normal form 
(3NF), Boyce-Codd normal form (BCNF), and 
fourth normal form (4NF)). 

The theory behind the arrangement of attributes 
into relations is known as schema normalization. 
Normalization allows us to recognize certain 
undesirable properties of relations (such as data 
redundancy or inconsistency) and shows how such 
relations can be converted to a more desirable form. 

Third normal form. A relation R is in 3NF if and 
only if the nonkey attributes (attributes that do not 
participate in the key of the relation), if any, are 

• mutually independent (two or more attributes are 
mutually independent if none of the attributes 
concerned is functionally dependent on any of the 
others); and 

• fully dependent on the key of R (attribute Y of 
relation R is fully functionally dependent on attribute 
X of relation R if it is functionally dependent on X 
and not functionally dependent on any proper subset 
of X). 




Boyce-Codd normal form. A relation R is in 
BCNF if and only if every determinant (any attribute 
on which some other attribute is fully functionally 
dependent) is a candidate key. 
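
A common way to test this condition is to compute attribute closures. The Python sketch below uses the standard closure algorithm and checks that the left side of every nontrivial FD determines the whole scheme; it is not the LDT's Turbo Pascal implementation, and the FDs assumed for the SUPPLIERS (SNAME, SADDRESS, ITEM, PRICE) scheme (SNAME -> SADDRESS and SNAME, ITEM -> PRICE) are the usual textbook ones, stated here as an assumption.

```python
def closure(attributes, fds):
    """Return all attributes functionally determined by the given set."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_bcnf(scheme, fds):
    """True if the left side of every nontrivial FD is a key or superkey."""
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue                      # trivial dependency, always allowed
        if not set(scheme) <= closure(lhs, fds):
            return False                  # determinant does not identify tuples
    return True


# Assumed dependencies for the Suppliers scheme discussed in the article.
SUPPLIERS = {"SNAME", "SADDRESS", "ITEM", "PRICE"}
SUPPLIER_FDS = [({"SNAME"}, {"SADDRESS"}), ({"SNAME", "ITEM"}, {"PRICE"})]
```

Under these assumed dependencies, SNAME is a determinant but not a key of SUPPLIERS, so the scheme is not in BCNF, while each of the smaller schemes (SNAME, SADDRESS) and (SNAME, ITEM, PRICE) is.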

Fourth normal form. A relation R is in 4NF if 
and only if, whenever an MVD exists in R, say A 
->> B, all attributes of R are also functionally 
dependent on A. In other words, the only dependencies 
(FDs or MVDs) in R are of the form K -> X (that is, 
a functional dependency from a candidate 
key K to some other attribute X). 

Decomposition 

Relations are transformed from one normal form 
to another by decomposing them into a set of smaller 
relations. 5-7,10 The decomposition of a relation 
scheme R = {A_1, A_2, ..., A_n} is its replacement by a 
collection ρ = {R_1, R_2, ..., R_k} of subsets of R such 
that R = R_1 ∪ R_2 ∪ ... ∪ R_k. Such decomposition 
should be lossless, which means that if we join the 
relations r_i for R_i (i = 1, 2, ..., k), 
we obtain the original relation r for R (no 
information is lost). 

Database acyclicity 

We would generally like database schemes to be 
acyclic. 6,7,9 One reason is that a unique connection 
between attributes is desirable. A cyclic schema 
causes problems in interpreting queries because 
multiple paths connect a set of attributes involved 
in the queries. For example, consider the 


Figure A. Examples of a relational schema (a) and 
a corresponding graph containing a cycle (b). 


banking example of Figure A that represents a cyclic 
schema. If a query requests those banks associated 
with a given customer, it is not clear whether the 
desired information is the set of banks where the 
customer has loans, or the set of banks where the 
customer has accounts, or both. 


The Suppliers example no longer exhibits these problems 
if we decompose the relation scheme into the following two 
relation schemes. 

SA(SNAME, SADDRESS) 

SIP (SNAME, ITEM, PRICE) 
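
A small experiment shows why this decomposition is safe. The Python sketch below projects a SUPPLIERS instance onto SA and SIP and rejoins the projections on SNAME; the supplier names, addresses, items, and prices are invented sample data, and the variable names are illustrative.

```python
# Hypothetical sample tuples: (SNAME, SADDRESS, ITEM, PRICE).
suppliers = {
    ("Acme", "12 Main St", "bolt", 0.10),
    ("Acme", "12 Main St", "nut", 0.05),
    ("Best", "9 High Rd", "bolt", 0.12),
}

# Project onto the two smaller schemes; sets remove duplicate tuples.
sa = {(sname, saddress) for sname, saddress, _, _ in suppliers}
sip = {(sname, item, price) for sname, _, item, price in suppliers}

# Natural join of SA and SIP on the shared attribute SNAME.
joined = {
    (sname, saddress, item, price)
    for sname, saddress in sa
    for sip_name, item, price in sip
    if sname == sip_name
}

lossless = joined == suppliers
```

Each address is now stored once in SA, so it cannot be updated inconsistently, and a supplier's address survives in SA even when all of its SIP tuples are deleted. The join reproduces the original relation as long as each SNAME has a single address.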

Note that none of the commercial relational 
database-management systems (RDBMSs) have any 
facilities to assist users in designing a logical view of data to 
prevent these anomalies from being built into the actual 
database. A design tool such as the LDT can provide the 
necessary assistance to users so that they can process 
the relation schemes and decompose them when it is 
required. 



The LDT is mainly based on the concept of schema 
normalization. 5,7,10 We have implemented different 
operations in the LDT to help the designer of a 
relational database manipulate relation schemes and 
see whether they are in certain normal forms (third, 
Boyce-Codd, or fourth 5,7,10 ). If not, these operations 
decompose the relation schemes into certain normal 
forms if so desired. In addition, the LDT includes other 
operations that test for both the acyclicity of relation 
schemes 6,7,9 and what we call "lossless" join 
decomposition. 5,7,10 These operations enable the designer to 
detect other undesirable problems such as a cycle in a 
schema that causes ambiguity in the interpretation of 
some queries. 









Overview of the LDT 

We wrote the program in Turbo Pascal for the IBM 
PC AT/XT so that users can easily expand or modify it. 
This window-based, menu-driven package contains 
a help file that can be accessed at any level of the 
program. Users can define both the relation schemes 
and the dependencies. 

The LDT contains a built-in editor to select the
operations. Users can input data either from the
keyboard or from an input file. They can also add data
to or delete data from the schema or dependency tables.
The LDT has a static maximum limit for the number of
attributes in a relation or a dependency. However,
removing these restrictions does not disturb the
structure of the program in any way.

Here we briefly describe the structure and syntax of 
the relations and dependencies used in the LDT. 


Data structure. The first structure is the symbol
table, SYM_TAB. The symbol recognizer stores each
attribute name that it identifies in this table as shown in
Figure 1. It then checks attributes contained in
dependencies against the symbol table to ensure the validity
of the dependencies.

Figure 2 shows the relation schemes table,
SCH_TAB. Each relation scheme identified by the
sentence recognizer is stored in this table.

Figure 3 shows the data dependencies table,
DEP_TAB. Each dependency identified by the
sentence recognizer is stored in this table.


External files. The LDT checks two files, schema
(ssch) and dependency (ddep), when users first start
the system. If the files are not empty, the data in
ssch and ddep is restored to tables SCH_TAB and
DEP_TAB, respectively. When users quit the system, it
backs up the data in tables SCH_TAB and DEP_TAB to
the same files for the next use.

The syntax for relations and dependencies is as
follows, beginning with relations:

<legal_id> (<legal_id> [,<legal_id>])

example: bank(account, customer_name, customer_add)

The syntax for a functional dependency is

FD <legal_id> [,<legal_id>] | <legal_id> [,<legal_id>]

example: FD account | customer_name, customer_address

The syntax for a multivalued dependency is

MVD <legal_id> [,<legal_id>] | <legal_id> [,<legal_id>]

example: MVD supplier_num | supplier_name, location

A join dependency is as follows:

JD <legal_id> [,<legal_id>] |...| <legal_id> [,<legal_id>]

example: JD a, b, c | c, d, e | e, f, g
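The input syntax above can be illustrated with a small recognizer. This is a minimal sketch in Python, not the LDT's Turbo Pascal code. It assumes that the separator printed between attribute lists is the vertical bar, and that a legal identifier is a letter followed by letters, digits, underscores, or "#"; the article defines neither. All function names are hypothetical.

```python
import re

# Assumed shape of <legal_id>; the article does not define it.
LEGAL_ID = re.compile(r"[A-Za-z][A-Za-z0-9_#]*")


def check_id(name):
    """Return the stripped identifier, or raise ValueError if it is illegal."""
    name = name.strip()
    if not LEGAL_ID.fullmatch(name):
        raise ValueError(f"illegal identifier: {name!r}")
    return name


def parse_attrs(text):
    """Split a comma-separated attribute list into validated names."""
    return [check_id(part) for part in text.split(",")]


def parse_relation(line):
    """Parse 'name(attr, attr, ...)' into (name, [attrs])."""
    match = re.match(r"\s*([^()]+?)\s*\((.*)\)\s*$", line)
    if match is None:
        raise ValueError(f"not a relation scheme: {line!r}")
    return check_id(match.group(1)), parse_attrs(match.group(2))


def parse_dependency(line):
    """Parse 'FD|MVD|JD list | list [| list ...]' into (kind, [lists])."""
    parts = line.strip().split(None, 1)
    if len(parts) != 2:
        raise ValueError(f"not a dependency: {line!r}")
    kind = parts[0].upper()
    if kind not in ("FD", "MVD", "JD"):
        raise ValueError(f"unknown dependency type: {parts[0]!r}")
    groups = [parse_attrs(group) for group in parts[1].split("|")]
    # FDs and MVDs have exactly two sides; a JD has two or more components.
    if len(groups) < 2 or (kind != "JD" and len(groups) != 2):
        raise ValueError(f"wrong number of attribute lists: {line!r}")
    return kind, groups
```

A recognizer of this kind returns plain lists of names, which can then be checked against a symbol table before a dependency is accepted.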


SYM_TAB [i] ::=   Attribute name | Next | Code

Code    Integer (>=0), which represents the attribute name
Next    Pointer (>=0) to the next component

Figure 1. Symbol table.


SCH_TAB [i] ::=   Count | Relation name | Attributes

Attributes    Integer codes that represent attribute names
Count         Sequence number of the relation scheme

Figure 2. Relation schemes table.


DEP_TAB [i] ::=   Count | Dep-type | Attributes

Attributes    Integer codes that represent attribute names
Count         Sequence number of the data dependency
Dep-type      Type of data dependency

Figure 3. Data dependencies table.
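The three tables can be sketched as follows. This is an illustrative Python model rather than the LDT's actual Pascal records. It replaces the Next pointer of SYM_TAB with a dictionary lookup, and the class and method names are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SchemeEntry:
    """One row of SCH_TAB."""
    count: int          # sequence number of the relation scheme
    name: str           # relation name
    attributes: list    # integer codes of the attribute names


@dataclass
class DependencyEntry:
    """One row of DEP_TAB."""
    count: int          # sequence number of the dependency
    dep_type: str       # "FD", "MVD", or "JD"
    attributes: list    # one list of integer codes per attribute list


@dataclass
class Tables:
    codes: dict = field(default_factory=dict)    # SYM_TAB: name -> code (>= 0)
    sch_tab: list = field(default_factory=list)
    dep_tab: list = field(default_factory=list)

    def add_scheme(self, name, attrs):
        """Store a relation scheme, assigning codes to new attribute names."""
        coded = [self.codes.setdefault(a, len(self.codes)) for a in attrs]
        self.sch_tab.append(SchemeEntry(len(self.sch_tab) + 1, name, coded))

    def add_dependency(self, dep_type, groups):
        """Store a dependency; every attribute must already be in SYM_TAB."""
        for group in groups:
            for attr in group:
                if attr not in self.codes:
                    raise ValueError(f"unknown attribute: {attr!r}")
        coded = [[self.codes[a] for a in group] for group in groups]
        self.dep_tab.append(
            DependencyEntry(len(self.dep_tab) + 1, dep_type, coded))
```

Storing integer codes instead of names keeps the scheme and dependency rows compact, and the validity check in `add_dependency` mirrors the check against the symbol table described in the text.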


Functions of the LDT 

Here we describe functions that can assist the
designer to develop a logical view of the relational
database. 5,7,10 In the relational model, data is organized into
a set of relation schemes and data dependencies that
describe the interrelationships between the data. As
stated, in the logical design of a relational database,
normalizing the relation schemes can prevent anomalies
that occur at the time of manipulation of data.
Normalization is a process of decomposing a relation
scheme into several relation schemes based on the
existing data dependencies. Consequently, we have
considered a major issue in the implementation of the
LDT, namely, the schema-normalization problem. We
have considered three kinds of dependencies
(functional, multivalued, and join) and three normal forms
(third, Boyce-Codd, and fourth).

We summarize the functions of the LDT as follows. 
The system 

• detects whether or not a data dependency can be
inferred from a given set of data dependencies;

• detects whether or not a given set of relation
schemes and data dependencies is in third,
Boyce-Codd, or fourth normal form;

• decomposes a relation scheme into either
Boyce-Codd or fourth normal form;

• detects whether or not a given set of relation
schemes has the lossless-join property with respect to
a set of data dependencies; and

• detects whether a given set of relation schemes is
acyclic.

Figure 4 depicts the overall LDT system diagram. At
this point, the LDT has 14 commands; four of them
make up the I/O module. The I/O module is responsible
for processing the relation schemes and data
dependencies and creating the schema and dependency
tables. The heart of the LDT is the schema-normalization
module that enables users to process these tables in
constructing the relational schema in the desired
normal form.

To use the normal forms and decomposition
functions, users have the option either to employ the
relation schemes and dependencies stored in the ssch and
ddep files or to provide new sets of them. Note that the
schema-normalization module uses some other
functions such as key finding and chase that are not
available to users. The key-finding 6,7 function processes the
attributes of a relation scheme with the associated
dependencies to determine a set of candidate keys for
the scheme. The chase 6,7 function processes the
dependencies, attributes, and relation schemes represented as
a table. It then produces a new table, with respect to the
dependencies, that can be used by the key-finding and
lossless-join functions.
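Key finding can be illustrated for functional dependencies alone. The sketch below is not the LDT's method: the article says the LDT's key-finding function works from the table produced by the chase, whereas this example uses the textbook attribute-closure test and an exhaustive search over attribute subsets, which is practical only for small schemes. The function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(scheme, fds):
    """Return every minimal attribute set whose closure covers the scheme."""
    scheme = frozenset(scheme)
    keys = []
    # Try smaller subsets first so that only minimal keys are recorded.
    for size in range(1, len(scheme) + 1):
        for combo in combinations(sorted(scheme), size):
            candidate = frozenset(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) >= scheme:
                keys.append(candidate)
    return keys
```

For a scheme R(CITY, STATE, ZIP_CODE) with CITY, STATE determining ZIP_CODE and ZIP_CODE determining CITY, this search reports two candidate keys: {CITY, STATE} and {STATE, ZIP_CODE}.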

Figure 4. The overall LDT system.

Figure 5. Connection between the LDT and the RDBMS.

In the other module, we have included the following
functions to improve the capability of the LDT.

First, a dependency-inference function enables users
to eliminate any dependency that is implied by the
existing dependencies. This function also uses the
optimum-cover function to define a minimum set of
functional dependencies (FDs) that has no redundancy.
The dependency-inference function is very useful when
the dependencies are input, because users add to the
dependency table only those dependencies that are not
implied by existing ones.
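For functional dependencies, the inference test can be sketched with attribute closure: X determines Y is implied by a set F exactly when Y lies inside the closure of X under F. The Python below is an illustration under that restriction. It does not cover multivalued or join dependencies, and `remove_redundant` yields only a nonredundant cover, which is weaker than the optimum cover the article mentions. All names are hypothetical.

```python
def fd(lhs, rhs):
    """Build an FD from two space-separated attribute strings."""
    return frozenset(lhs.split()), frozenset(rhs.split())


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_implied(dep, fds):
    """True if the FD dep follows from the FDs in fds."""
    lhs, rhs = dep
    return rhs <= closure(lhs, fds)


def remove_redundant(fds):
    """Drop, one at a time, each FD that the remaining FDs already imply."""
    kept = list(fds)
    i = 0
    while i < len(kept):
        others = kept[:i] + kept[i + 1:]
        if is_implied(kept[i], others):
            kept = others   # kept[i] was redundant; recheck the same index
        else:
            i += 1
    return kept
```

Applied at input time, `is_implied` would let a tool refuse a new dependency that adds no information, which is the use the text describes.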

Second, the lossless-join function enables users to 
see whether the decomposition of a relation scheme to 
several relation schemes has the lossless-join property. 
This property guarantees that the original relation will 
not be changed as a result of the decomposition. 

Finally, the acyclicity-testing function enables users
to test whether a set of relation schemes is acyclic. As
mentioned, a cyclic schema is not desirable because it
introduces ambiguity in the interpretation of queries.
Therefore, it is very beneficial to test for schema
acyclicity before the actual database is built.

Application 

As mentioned, the LDT enables the user to manipu¬ 
late the data at the logical level by deriving the desired 
logical view of the data. The output generated by the 
LDT can feed directly into a commercial relational 
database software package like Oracle given the appro¬ 
priate interface between the LDT and the RDBMS. 
Figure 5 illustrates this connection. 

During normal RDBMS operation, users employ the 
create-table operation to generate the relation schemes. 
They then use the insert-into-table operation to add 




tuples into the relations. In most cases, users do not 
know whether these schemes are in a particular normal 
form or whether these schemes construct an acyclic 
schema. As a result, users may build a database with a 
potential for anomalies. 

In the new environment, in which the RDBMS utilizes
the LDT, users provide the relation schemes and
dependencies to the LDT. They then use the system to
put the relational schema in the desired normal form.
Upon exit from the LDT, the relation schemes in the
desired normal form and the dependencies are saved in
the schema and dependency files, respectively. The
RDBMS can use the schema file to create the relation
schemes. Users can employ the insert-into-table
operation to store the actual tuples in the relations.
However, the RDBMS can now use the dependency table to
enforce the dependencies (RDBMSs do not by themselves
check the validity of the tuples to be inserted with
respect to the dependencies).




It is obvious that enhancing the RDBMS with the 
LDT restricts users in terms of arbitrarily creating and 
deleting the relations. However, this new configuration 
(Figure 5) improves the data integrity within the data¬ 
base because most of the data anomalies that might 
occur at the logical level can be eliminated by using the 
LDT before building the actual database using the 
RDBMS. 

It is worth mentioning that the design of the
interface between the LDT and the RDBMS in Figure 5
can easily be applied to any other commercial RDBMS
(like IBM’s DB2 and SQL/DS, and Cincom
Corporation’s Supra 5) provided that the LDT functions
are implemented in the operational environment of the
RDBMS in use. For instance, the LDT functions can be
implemented using Standard Pascal on an IBM
mainframe, which can then link to the DB2 relational
database system. In general, LDT functions can easily be
incorporated into a host language (like IBM’s PL/I) as
a set of calls in the same way as the Data Manipulation
Language (DML) statements are incorporated when the
RDBMS package is used in a batch environment. In
addition, LDT functions can be used independently
prior to the use of the RDBMS package in an interactive
environment as described in this section (note that DB2
supports both batch and interactive environments).

The process described in this section can provide the 
basis for developing a common interface between the 
LDT and any RDBMS operating on any computer for 
the following reasons: 

• All commercial RDBMS packages operate on
relations.

• The structures used in the LDT implementation are
simple and do not require any special feature that might
not be supported by some computers.

• New trends in software development make it
possible to develop portable software 11 using such
methods as microcoding.


Expandability and hardware 

The LDT has been implemented on the IBM PC AT/XT
with a minimum memory of 256 Kbytes. We added
a number of small, device-dependent routines to make
this program easy to use. Device- and compiler-dependent
functions can be modified without changing
the basic functions of the LDT. Hence, the LDT can be
implemented on mainframes without making any major
modifications.


Examples of functions 

We tried the following examples on the LDT system. 
(Underlined attributes denote the key for the relation.) 

Normal forms checking. Given the relation scheme

R (CITY, STATE, ZIP_CODE) with dependencies
fd CITY, STATE | ZIP_CODE
fd ZIP_CODE | CITY

the LDT responds

: relation R is in third normal form;

: relation R is not in Boyce-Codd normal form.
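The two responses can be checked by hand with the standard definitions. The following Python sketch is an illustration, not the LDT's code. It considers functional dependencies only, and it assumes the candidate keys are supplied by the caller. A scheme is in Boyce-Codd normal form if the left side of every nontrivial dependency is a superkey; third normal form additionally tolerates a dependency whose right-side attributes are all prime, that is, members of some candidate key.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_bcnf(scheme, fds):
    """Every nontrivial FD must have a superkey on its left side."""
    scheme = set(scheme)
    return all(rhs <= lhs or closure(lhs, fds) >= scheme
               for lhs, rhs in fds)


def is_3nf(scheme, fds, keys):
    """Like BCNF, but an FD may also pass if its new attributes are prime."""
    scheme = set(scheme)
    prime = set().union(*keys)   # attributes that belong to some key
    return all(closure(lhs, fds) >= scheme or (rhs - lhs) <= prime
               for lhs, rhs in fds)


# The example from the text; the keys are worked out by hand.
R = {"CITY", "STATE", "ZIP_CODE"}
R_FDS = [({"CITY", "STATE"}, {"ZIP_CODE"}), ({"ZIP_CODE"}, {"CITY"})]
R_KEYS = [{"CITY", "STATE"}, {"STATE", "ZIP_CODE"}]
```

With these inputs, `is_3nf(R, R_FDS, R_KEYS)` holds because CITY is prime, while `is_bcnf(R, R_FDS)` fails because ZIP_CODE alone is not a superkey.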

Lossless-join testing. Given decomposition of the
previous relation scheme R into relation schemes

R1 (STATE, ZIP_CODE)

R2 (CITY, ZIP_CODE)

the LDT responds

: This is a lossless join.
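A lossless-join test of this kind is usually done with the chase on a tableau. The sketch below follows the textbook procedure for functional dependencies only and is not taken from the LDT. Each decomposed scheme contributes one row that holds the distinguished symbol "a" in its own columns and a unique symbol elsewhere; dependencies then equate symbols until nothing changes. The decomposition is lossless if some row ends up all "a". The function name is hypothetical.

```python
def lossless_join(scheme, parts, fds):
    """Chase test: True if joining the parts always recovers the relation."""
    attrs = sorted(scheme)
    # Row i: "a" where parts[i] has the attribute, else a symbol unique to i.
    rows = [{A: "a" if A in part else ("b", i) for A in attrs}
            for i, part in enumerate(parts)]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for r1 in rows:
                for r2 in rows:
                    if r1 is r2 or any(r1[A] != r2[A] for A in lhs):
                        continue
                    # r1 and r2 agree on lhs, so they must agree on rhs.
                    for A in rhs:
                        if r1[A] == r2[A]:
                            continue
                        # Keep "a" if either side has it.
                        if r1[A] == "a":
                            old, new = r2[A], r1[A]
                        else:
                            old, new = r1[A], r2[A]
                        for row in rows:
                            if row[A] == old:
                                row[A] = new
                        changed = True
    return any(all(row[A] == "a" for A in attrs) for row in rows)
```

For the decomposition in the text, the dependency from ZIP_CODE to CITY makes the two rows agree on CITY, so the row for R1 becomes all "a" and the test succeeds.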

Decomposition. Given the following relation
scheme and dependencies,

DWELLER (NAME, ADDRESS, APT#, RENT)
fd NAME | ADDRESS, APT#
fd ADDRESS, APT# | RENT

the LDT responds

: The relation scheme DWELLER is not in third normal
form.

Then the LDT decomposes that scheme to the third
normal form as follows.

TENANT (NAME, ADDRESS, APT#)

APARTMENT (ADDRESS, APT#, RENT)
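One standard way to reach a third-normal-form result like this one is the synthesis approach: group the dependencies by left side and build one scheme per group. The Python below is a stand-in for illustration, not the LDT's decomposition routine, which the article does not detail. It assumes the given dependencies already form a minimal cover and that a candidate key is supplied by the caller.

```python
def synthesize_3nf(fds, key):
    """Build 3NF schemes from a minimal cover of FDs plus one candidate key."""
    groups = {}
    for lhs, rhs in fds:
        lhs = frozenset(lhs)
        # One scheme per distinct left side: the left side plus all it determines.
        groups.setdefault(lhs, set(lhs)).update(rhs)
    schemes = list(dict.fromkeys(frozenset(s) for s in groups.values()))
    # Drop any scheme that is strictly contained in another one.
    schemes = [s for s in schemes if not any(s < t for t in schemes)]
    # Make sure some scheme holds a whole key, so the join stays lossless.
    if not any(frozenset(key) <= s for s in schemes):
        schemes.append(frozenset(key))
    return schemes
```

Given the two DWELLER dependencies and the key NAME, the grouping step produces the attribute sets of TENANT and APARTMENT, and no extra key scheme is needed because TENANT already contains NAME.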

Schema-acyclicity testing. In terms of the relational 
schema for a banking database given earlier in Figure 
A, the LDT generates a message to indicate that the 
schema is cyclic. The attributes Account, Bank, Loan, 
and Customer compose the cycle. Figure A also repre¬ 
sents the graph corresponding to the banking system. 
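Acyclicity of a set of relation schemes is commonly tested with the Graham (GYO) reduction: repeatedly delete attributes that occur in only one scheme and schemes that are empty or contained in another; the schema is acyclic exactly when everything can be deleted. The sketch below shows that procedure in Python. Whether the LDT uses this particular test is not stated in the article, and the four banking relations are an assumed reading of Figure A, which is not reproduced in the text.

```python
def is_acyclic(schemes):
    """GYO reduction: True if the schemes reduce to nothing."""
    edges = [set(s) for s in schemes]
    changed = True
    while changed:
        changed = False
        # Rule 1: remove attributes that appear in exactly one scheme.
        for edge in edges:
            lonely = {a for a in edge
                      if sum(a in other for other in edges) == 1}
            if lonely:
                edge -= lonely
                changed = True
        # Rule 2: remove one scheme that is empty or inside another scheme.
        for i, edge in enumerate(edges):
            if not edge or any(j != i and edge <= other
                               for j, other in enumerate(edges)):
                del edges[i]
                changed = True
                break
    return not edges


# Hypothetical relations for the banking example of Figure A.
BANKING = [{"Bank", "Account"}, {"Account", "Customer"},
           {"Customer", "Loan"}, {"Loan", "Bank"}]
```

Under that assumed schema neither rule ever applies, because each of Account, Bank, Loan, and Customer occurs in two relations and no relation contains another, so `is_acyclic(BANKING)` is false.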


We have presented a logical design tool to assist
designers of logical relational databases. The
LDT enables the designer to manipulate the
data at the logical level to derive the desired view of
the data. The output generated by the LDT can feed
directly into commercial relational database software
packages.




















We are currently expanding the LDT to include
capabilities for automatic generation of functional and
multivalued dependencies and automatic conversion of
a cyclic relational schema to an acyclic schema.
Another intriguing direction is enhancing the user
interface to make the LDT more of an expert system that can
understand the designer’s context and goals and give
useful advice about alternative design choices and their
implications.


Acknowledgments 

We thank the reviewers for their suggestions, which 
have resulted in several improvements over our origi¬ 
nal manuscript. 


References 

1. S.L. Pfleeger, Software Engineering: The Production of
Quality Software, Macmillan Co., Inc., New York, 1987.

2. A.F. Case, Jr., Information Systems Development:
Principles of Computer Aided Software Engineering,
Prentice-Hall, Englewood Cliffs, N.J., 1986.

3. CASE Quarterly, Vol. 1, No. 1, Apr. 1987.

4. J. Martin and C. McClure, Structured Techniques: The
Basis for CASE, Prentice-Hall, 1986.

5. C.J. Date, An Introduction to Database Systems, Vol. 1,
4th ed., Addison-Wesley Pub. Co., Reading, Mass., 1986.

6. D. Maier, The Theory of Relational Databases,
Computer Science Press, Rockville, Md., 1983.

7. J.D. Ullman, Principles of Database Systems, Computer
Science Press, 2nd ed., 1982.

8. ORACLE, Oracle Corp., Belmont, Calif.

9. C. Beeri et al., “On the Desirability of Acyclic Database
Schemes,” J. ACM, Vol. 30, No. 3, July 1983, pp. 479-513.

10. E.F. Codd, “Further Normalization of the Database
Relational Model,” in Courant Computer Science Symposia,
Vol. 6, Data Base Systems, Prentice-Hall, 1971, pp. 33-64.

11. P.J.L. Wallis, Portable Programming, The Macmillan
Co., London, 1982.


M. Mehdi Owrang O. is an assistant professor of Computer
Science and Information Systems at the American
University, Washington, D.C. His research interests include
software engineering, databases, and query translation in the
distributed database environment.

Owrang received the MS and PhD degrees in computer
science from the University of Oklahoma at Norman. He is a
member of the Association for Computing Machinery, the
IEEE, the Sigma Xi (American Scientist Society), and the
IEEE Computer Society.



W. Gamini Gunaratna is currently a research associate for 
the Center for Bio Analytical Research (CBAR) at the Uni¬ 
versity of Kansas. He was a graduate student in computer 
science at the American University when he coauthored this 
article. He has been a guest scientist at the National Bureau 
of Standards, Scientific Computing Division. His research 
interests are mainly in the area of CASE studies and database 
and algorithm optimization techniques. 

Gunaratna received the BSc degree from the University of 
Colombo, Sri Lanka, and the MS degree in computer science 
from the American University, Washington, D.C. 


Questions regarding this article may be directed to Mehdi Owrang, the American University, Department of Computer Science 
and Information Systems, 4400 Massachusetts Ave., N.W., Washington, D.C. 20016. 






















Department 


Micro Law 


Richard H. Stern 
Law Offices of Richard H. Stern 
1300 19th Street N.W., Suite 300
Washington, DC 20036 


Appropriate and inappropriate legal protection 
of user interfaces and screen displays. Part I 


Whether and how user interfaces
and screen displays for computer
programs should be protected
against competitive imitation is
an area currently generating
considerable legal controversy. Proponents of
such protection assert that proprietors of
these highly valuable adjuncts of
computer software need and deserve
protection against copying. Others question
whether such protection might more
hinder than promote software progress.

Those who conclude or assume that 
there should be legal protection for user 
interfaces and screen displays disagree 
over the preferred legal mechanism. 
Those who conclude or assume that the 
mechanism should be the present federal 
copyright laws disagree over the method 
of protection. Should user interfaces and 
screen displays be placed under a gen¬ 
eral umbrella of legal protection ex¬ 
tended to the computer programs to 
which they relate? Or should they in¬ 
stead be protected separately as things 
like pictures? Of major concern to soft¬ 
ware entrepreneurs and software users is 
which aspects of user interfaces and 
screen displays should be reserved in 
the public domain, which should be 
legally protected, and how far the 
protection should go. 

The conflicting opinions and general
uncertainty led the US Copyright Office,
the federal agency charged with
administering the copyright system, to solicit
industry comments. The Office held a
public hearing on these issues in
September 1987 to aid it in setting its
policy. The IEEE Computer Society’s
Board of Governors passed a resolution
favoring protection of screen displays
(see adjacent box) and transmitted it to
the Copyright Office.


Resolution

Resolved, that the Board of
Governors of the Computer Society
of the IEEE, in order to
encourage and promote creativity
and investment in this aspect of
the computer graphics field,
recommends that the Copyright
Office of the Library of Congress
should permit an owner of a
computer program screen display to
register the screen for copyright
protection apart from the registration
of the underlying code in the
computer program that generates
the display.

A panel of witnesses from the Com¬ 
puter Society testified before the Copy¬ 
right Office, presenting a spectrum of 
views. Although different Computer 
Society panelists had different degrees 
of enthusiasm for a regime of legal pro¬ 
tection of screen displays and user inter¬ 
faces, their consensus was that some 
protection was desirable. They also 
agreed that the kind of protection should 
be independent of that for whatever 
computer code or computer program 
was used to generate screen displays. 

The three IEEE Computer Society
witnesses were S. Levine, A. Peller, and
myself. Levine expressed apprehension
lest excessive legal protection stifle
technological progress. Peller was
convinced that the incentives of protection
would contribute far more socially than
they could hinder progress. I felt
protection would probably promote progress
more than hinder it if the Copyright
Office was very careful in how it
administered the law.

Despite the possible differences in 
approach taken by the Computer Society 
witnesses, it is fair to say that all of 
them at least tacitly agreed with the 
premise that progress in software and 
software innovation is to be desired. 
That premise also underlies the present 
analysis. Probably these witnesses all 
accepted, as well, the further premise 
that increased public acceptance of com¬ 
puters as a part of everyone’s daily life 
is socially desirable, and that prolifera¬ 
tion of usage of computers and com¬ 
puter software in many aspects of life 
ought to be encouraged as a means to¬ 
ward development of a better society. 
That agenda is also a philosophical 
premise of this analysis. 

After a nine-month gestation period, 
in June 1988 the Copyright Office is¬ 
sued a policy statement governing copy¬ 
right registrations of screen displays. 
Notwithstanding judicial precedent pre¬ 
scribing separate copyright treatment of 
computer programs and screen displays, 
the Copyright Office determined that 
screen displays should be regarded as an 
“integrally related” part of the computer 
programs to which they relate, rather 
than as separately protectable works. 
Accordingly, the Office now refuses to 
register screen displays separately as 
distinct works of authorship, and regis¬ 
ters them and the computer programs 
whose code generates them as a single 
package. 












Implicit in the Copyright Office’s
policy statement, but left undiscussed, is
the Office’s apparent conclusion that
screen displays are or should be in some
way protected to some extent by
existing copyright law. The proper scope of
such copyright protection in screen
displays is addressed only insofar as the
Office stated that it declined to opine on
the subject.

Some members of the Computer 
Society’s Committee on Public Policy 
felt the Copyright Office’s action was so 
wrong-minded and potentially damaging 
to software progress that it should be 
subjected to judicial review or an at¬ 
tempt should be made to overturn it by 
legislation. 1 A poll of interested mem¬ 
bers, however, led to a majority vote for 
taking no action at this time and instead 
for awaiting further developments. 

This column is the first part of a se¬ 
ries of Micro Law columns evaluating 
the arguments for and against the pro¬ 
tection of user interfaces and screen dis¬ 
plays, and possible legal mechanisms 
for their protection under existing or 
new laws. It is an expansion and restate¬ 
ment of parts of the written statement 
submitted to the Copyright Office on 
behalf of the Computer Society. Here 
and in the next issue I discuss the pos¬ 
sible difficulties and problems that pro¬ 
tection of user interfaces and screen 
displays might cause. 

Subsequent columns conclude that
protecting user interfaces and screen
displays under the existing copyright
laws is, despite some risks and by a
rather narrow margin, on balance the most
desirable course of legal action. The
Copyright Office now requires that
screen displays be legally intertwined
with their associated computer
programs. This, I suggest, is a serious
mistake. Treating them as distinct pictorial,
audiovisual, or other works, as the
Computer Society urged, would be better.

Definition of subject matter 

The user interface for a computer
program may be defined as the means by
which users of the program interact with
it to input data, to direct the
performance of procedures, and to receive
information produced by the program. While
this term is too novel to be well defined
yet, one could understand it in its most
literal sense. That is, the user interface
is the boundary between a user and a
computer system, just as in physics an
interface is the surface between any two
phases of a system, such as ice and
water, water and air, and air and ice in
the case of a glass of ice water. If you
hold a glass of ice water, an interface
exists between the outside of the glass
and the outside of the skin of the palm
of your hand. That terminology or usage
is not helpful or idiomatic, however, in
the present context, and it does not
relate immediately to interesting and
controversial legal questions.




There are other interfaces found in 
computer-related technology. For ex¬ 
ample, protocols for interconnection of 
computers and peripheral equipment are 
sometimes termed interfaces. Sets of 
commands used in computer languages 
or for interaction of computer programs 
are also sometimes termed interfaces. In 
this and following columns, I use the 
term user interface only to refer to inter¬ 
faces between human users of computer 
programs and computer equipment used 
to execute the computer programs. 

Probably the most common forms of 
user interface are sets of keystrokes by 
means of which a user inputs data to a 
computer or directs performance of pro¬ 
cedures in a computer, via the keyboard 
of the computer. Visual patterns appear 
on the screen of the cathode-ray tube 
monitor for the computer (screen dis¬ 
plays), by which the user perceives out¬ 
puts, receives cues for making inputs, 
and receives other information from the 
computer and computer program. Those, 
too, are user interfaces. 

It may be helpful in some contexts to
distinguish between a) the mechanical
or physical aspect of a user interface,
that is, a device, and b) the aspect that is
the use or method of using an
interface device. Thus, the computer’s
keyboard is a mechanical interface device,
while the keystrokes are the user’s
method of interfacing by means of a
keyboard (that is, the user presses the
keys of the keyboard and causes
keystrokes). The former is tangible and
relatively permanent; the latter is
intangible and ephemeral.

User interface devices of possibly in¬ 
creasing future importance to microcom¬ 
puter systems are touch screens and 
voice-actuated devices. These permit 
users to input commands by touching a 
part of the screen and by speaking into a 
microphone to the computer. (By the 
same token, the computer can “talk 
back” via a loudspeaker.) 

Special-purpose computer systems 
may utilize other user interface means. 
For example, a coin-operated video 
game machine is a special-purpose mi¬ 
crocomputer in which the user (game 
player) inputs signals (data) to the com¬ 
puter by moving a joystick or trackball 
and pressing buttons. Still another inter¬ 
face device permits players to move 
their hands, thereby changing capaci¬ 
tance or some aspect of a field, and the 
motions are translated into input signals. 
A user performs these acts in response 
to another aspect of the user interface 
for the video game, the screen display. 

Such user interfaces for computer sys¬ 
tems find their parallels in other types of 
electrical and mechanical systems. For 
example, the steering wheel and accel¬ 
erator pedal are user interface devices 
for operating automobiles. The steering 
wheel acts both as an input interface de¬ 
vice and as an output interface device, 
since the driver uses the steering wheel 
to cause the car to change direction and 
at the same time gets feedback from its 
varying resistance, which gives the 
driver a “feel” for road conditions. A 
gearshift is another automobile interface 
device, as is a brake pedal. 

In general, any physical system of 
which a human being is a part or with 
which a human being interacts must 
have a user interface. 

In many systems, the user interface is
not something of great economic value
or, at least, not the subject of much legal
controversy. User interfaces for computer
programs, however, are of great economic
value and are becoming the subject of
increasing legal controversy. Several recent
court decisions and pending lawsuits
involve assertions of rights under the
copyright laws to particular user
interfaces for computer programs. As a
result, considerable industry concern has
been expressed over the implications of
these assertions of proprietary rights.

This series explores the kind of legal
rights that might be asserted over
nondevice aspects of user interfaces, of which
menu screen displays are illustrative.
Devices, such as joysticks and touch
screens, are usually protected by patents
or other well-understood legal systems,
and they ordinarily do not raise the
perplexing issues with which this series
will struggle. (Hardware considerations
relevant to this discussion appear in the
accompanying box.) The series
principally discusses legal rights in screen
displays for computer programs, and
secondarily discusses legal rights in sets of
keystrokes and in other possible aspects
of user interfaces that are
generalizations of, or extrapolations from,
keystrokes.


Hardware Considerations

Hardware considerations appropriate
for this discussion include
displays and methods of input.

Displays

The ordinary form of screen display
for most present microcomputer
systems is a monochrome or
color cathode-ray tube monitor,
usually a raster-scan television
monitor. Other display technologies
now exist, however, and are coming
into increasing use. These include
plasma, liquid crystal, and
electroluminescence. The design
choices available to screen designers
differ with these different display
technologies, and software
must be written differently to take
advantage of the unique characteristics
of the various display means.
The legal and policy points discussed
in this series, however, do
not turn on the particular display
technology that is used.

Input

The ordinary method for a user to
input commands or other information
to a computer is pressing keys
on a keyboard connected to the
computer (entering keystrokes). For
example, in using a computer program
associated with a menu, a
user may press an alphanumeric
key or keys and then the Return or
Enter key to command the computer
to perform a function. An
example might be entering “P” to
print a document. In the remainder
of this series, I use the convention
of representing keystrokes by angle
brackets (for example, <P> for
keystroke “P”).

Other input user interface
devices now in use include the
joystick, trackball, mouse, touch
screen, and microphone. As in the
case of display technologies, different
input technologies have different
unique characteristics and
require different software, but they
do not appear to raise significantly
different legal or policy issues.

The subject matter involves more than
the enormous amounts of money that are
at stake, as between the parties. But that
cannot be ignored. The installed base of
Lotus’s 1-2-3 spreadsheet programs, for
example, has been estimated at more
than 3 million copies. Current retail
prices are several hundred dollars per
copy. 2 Lotus’s 1987 revenue from 1-2-3
was approximately $260 million. 3 To
protect this market from competitive
encroachments, Lotus has brought
copyright infringement actions against two
marketers of 1-2-3 “workalikes.” The
complaints allege that the defendants
copied the 1-2-3 user interface to divert
customers from the more expensive
Lotus program. 4,5

In addition to the money at stake, 
there are significant public interest im¬ 
plications of decisions to protect and 
refuse to protect user interfaces. More¬ 
over, the questions raised in the debate 
over protecting user interfaces are those 
raised continually, whenever protection 
is sought for some new aspect of soft¬ 
ware technology. The present contro¬ 
versy thus paradigmatically brings for¬ 
ward the arguments and problems that 
will be debated for years to come. Such 
controversy appears each time a new 
form of software technology emerges 
and would-be proprietors of rights in 
such technology seek legal insulation 
from competitive imitation. 


Intrinsic and habit aspects 

Those who market software care a
great deal about ownership of user
interface rights, because the user interface of
a computer program is a major factor in
determining its commercial success. 6
There are two major aspects of the
relation between a user interface and a
computer program’s commercial success.
They may be termed the intrinsic aspect
and the habit aspect, respectively.

The intrinsic aspect or qualities of a 
user interface concern its user friendli¬ 
ness. That concept requires considerable 
elaboration. But for the moment user 
friendliness may be equated with the 
ease with which users can learn how to 
use a computer program and the ease 
with which they can then perform tasks 
with it. Some user interfaces make for a 
user-friendly computer program, while 
others exasperate users. A high correla¬ 
tion, to say the least, exists between a 
computer program’s user friendliness 
and its commercial success. 

The habit aspect of a user interface 
involves the fact that users of account¬ 
ing and other business computer pro¬ 
grams display great reluctance to learn 
how to use a new user interface, once 
they have taken the trouble to learn how 
to use one user interface. It also costs 
money to train employees to use a new 
interface, and they make mistakes while 
habituating themselves to it. 

Accordingly, a competitor’s legal inability
to utilize or appropriate (or misappropriate,
depending on where you stand)
an established user interface may greatly 
hinder the competitor from marketing a 
competitive program. By the same to¬ 
ken, the prospect that clone-makers will 
be legally disabled from misappropriat¬ 
ing the user interface of a program, once 
it is commercially successful, may fa¬ 
cilitate the financing of a new software 
venture and spur software creativity. 
Thus, customer habituation to a user in¬ 
terface, apart from the intrinsic user 
friendliness of the interface, correlates 
with commercial success for the 
program. 

To be sure, the speed and capabilities 
of a computer program have an effect on 
its commercial success. So do the
quality of the documentation and the
user support (hand-holding) made available
to customers. So do advertising
and promotion. The user interface is not the
only pertinent consideration. Programs 
with unfriendly user interfaces have 
been successful, and users have 
switched from one program to a more 
powerful competitor even though they 
had to learn a new user interface to do 
so. Nevertheless, it must be recognized 
that the user interface is a major factor 
in determining commercial success. It is 
thus no surprise that marketers of com¬ 
mercial software have considered user 
interfaces worth litigating over. Among
the reported decisions involving unauthorized
competitive replication of screen displays
are two cases. 7-9 (For a brief description
of a now-settled copyright infringement
controversy over a computer graphics
interface system, see my 1986 account in
IEEE Micro. 10)


Code and noncode aspects of 
screen displays 

The electronic circuitry responsible 
for generating a screen display associ¬ 
ated with a computer program is 
controlled by a part of the computer pro¬ 
gram. Typically, the creator of a new 
software product decides that the prod¬ 
uct will perform a particular set of tasks 
or functions and that it will interact with 
the user by means of a particular set of 
screen displays and other means for in¬ 
put/output. The code for executing the 
tasks (for example, word processing, 
spreadsheet calculation, or sorting data) 
will ordinarily be distinct from the code 
for generating the screen displays, al¬ 
though the two must be linked so that 
they co-act. It is common to determine 
the set of screens and pattern of flow 
from screen to screen (starting with the 
main menu) before writing any code for 
the computer program. 

The accompanying box shows a para¬ 
digmatic, simple main menu (Figure A). 
This menu illustrates the screen-to- 
screen flow process, interrelation of dif¬ 
ferent sets of code, and other issues that 
will arise in the discussion that follows. 
The main menu shown in the figure is 
based on a word-processor main menu 
first introduced many years ago. 

Programs for generating screen dis¬ 
plays can be written in different com¬ 
puter languages. For example, the 
computer code for the menu screen dis¬ 
play shown in Figure A could be written 
for an IBM PC-compatible microcom¬ 
puter (PC clone) in Basic, C, DOS batch 
program language, 8088 assembly 
language, and any one of many other 
computer languages. The screen display 
generated would still look the same. 
Moreover, within most languages there 
are different ways to write a program 
that will create the same display. So, 
many different programs, which do not 
resemble one another, can be written to 
produce any particular screen display, 
for a given display device. It is also true 
that minor modifications of a computer 
program for a screen display can cause
visually significant differences in the 
display. 
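
The point that dissimilar programs can yield the same display can be made concrete with a minimal sketch. It is written in Python purely for illustration (the article names Basic, C, DOS batch language, and 8088 assembly), and the function names and the `sep` parameter are hypothetical. The two routines share no structure, yet both return the text of the Figure A menu character for character; changing the single separator string alters every command line on the screen.

```python
# Illustrative only: two structurally different ways to produce the
# same menu text. All names here are invented for this sketch.

MENU_ITEMS = [
    ("C", "Create a new document"),
    ("E", "Edit an existing document"),
    ("P", "Print a document"),
    ("I", "Index of documents on file"),
    ("D", "Delete a document from file"),
    ("M", "More menu selections"),
    ("Q", "Quit using system"),
]


def render_literal() -> str:
    """Approach 1: the whole screen stored as one block of literal text."""
    return (
        "Main Menu\n"
        "\n"
        "C = Create a new document\n"
        "E = Edit an existing document\n"
        "P = Print a document\n"
        "I = Index of documents on file\n"
        "D = Delete a document from file\n"
        "M = More menu selections\n"
        "Q = Quit using system\n"
        "\n"
        "Type the right letter and press Return.\n"
    )


def render_from_table(sep: str = " = ") -> str:
    """Approach 2: the screen assembled line by line from a data table."""
    lines = ["Main Menu", ""]
    for key, label in MENU_ITEMS:
        lines.append(f"{key}{sep}{label}")
    lines += ["", "Type the right letter and press Return."]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Same display from different code.
    print(render_literal() == render_from_table())
    # A one-string change to the code changes every command line shown.
    print(render_from_table(sep=": "))
```

Comparing the two function bodies shows little textual resemblance, which is why similarity of code and similarity of display have to be considered separately.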


Illustrative Main Menu 


The following simple word-proc¬ 
essing main menu illustrates various 
issues that will come up in this and 
future discussions. 

The computer program with which 
such a menu is associated will in¬ 
clude code for generating this menu 
at the start of a session and at other 
appropriate times. The program will 
also contain code that does appropri¬ 
ate things when menu commands are 
invoked. For this word processor 
menu, that would include such things 
as: 

• accessing and displaying a list of 
documents on file, when the key¬ 
stroke <I> is entered; 

• bringing up another menu of selections,
when the keystroke <M> is entered; and

• terminating the word processing 
session, when the keystroke <Q> is 
entered. 

Entry of some keystrokes will
cause “prompts” to appear on the
screen. For this word processor
menu, if the keystroke <E> is entered,
a prompt will ask the user to
enter the name of the document that
is to be edited. When that name is
entered by means of appropriate keystrokes,
code will cause hardware
operations to occur: part of the computer
program will cause the reading
head of the disk drive to be positioned
at the beginning of the chosen
document, read the start of the document
into the computer’s random access
memory, and display the beginning
of the document on the screen.


Main Menu


C = Create a new document
E = Edit an existing document
P = Print a document
I = Index of documents on file
D = Delete a document from file
M = More menu selections
Q = Quit using system

Type the right letter and press Return.


Figure A. Screen display of a simple
word processing menu.

Carrying out a prompt or entering 
a command may cause another menu 
of commands to appear. Thus, if the 
keystroke <P> is entered, a prompt 
will appear on the screen asking the 
user to enter the name of the docu¬ 
ment that is to be printed. When that 
is done, the computer program will 
bring up another menu screen so that 
the user can enter commands about 
how to format the printing of that 
document. And when that is done, 
the computer program will send sig¬ 
nals to the printer to initiate the 
printing of the document. 
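
The separation between the menu display and the code that "does appropriate things" when a command is invoked can be sketched as a dispatch table. This is a hypothetical illustration in Python, not the design of any actual word processor: the handler names and the strings they return are invented, and the handlers merely describe the actions the box attributes to the keystrokes <I>, <E>, <P>, <M>, and <Q>. The keystrokes <C> and <D> are left out because the box does not describe their behavior.

```python
from typing import Callable, Dict, List

# A Reader takes a prompt string and returns what the user typed.
Reader = Callable[[str], str]


def index_documents(read: Reader) -> List[str]:
    return ["list documents on file"]


def edit_document(read: Reader) -> List[str]:
    # <E> first raises a prompt, then acts on the name that is entered.
    name = read("Name of document to edit: ")
    return [f"open {name}", f"display start of {name}"]


def print_document(read: Reader) -> List[str]:
    # <P> raises a prompt, then a second menu, then starts the printer.
    name = read("Name of document to print: ")
    return [f"show print-format menu for {name}", f"send {name} to printer"]


def more_selections(read: Reader) -> List[str]:
    return ["show second menu"]


def quit_session(read: Reader) -> List[str]:
    return ["end session"]


# The table links the display code (which shows the letters) to the
# working code (which carries out each command).
COMMANDS: Dict[str, Callable[[Reader], List[str]]] = {
    "I": index_documents,
    "E": edit_document,
    "P": print_document,
    "M": more_selections,
    "Q": quit_session,
}


def handle_keystroke(key: str, read: Reader = input) -> List[str]:
    """Return the actions triggered by one menu keystroke."""
    handler = COMMANDS.get(key.upper())
    if handler is None:
        # Unrecognized key: an assumed fallback, not described in the box.
        return ["redisplay main menu"]
    return handler(read)


if __name__ == "__main__":
    print(handle_keystroke("e", lambda prompt: "LETTER.DOC"))
```

Nothing in the table depends on how the menu is drawn, and nothing in the drawing code depends on the handlers, which mirrors the distinction between display code and working code.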


Most of the litigation and legal dis¬ 
putes concerning screen displays that 
have been publicized so far have in¬ 
volved menus. Screen displays can also 
be the visual result of inputting data to a 
program. Such an output display may be 
static or animated. Programs now exist 
that produce pictorial displays showing 
the result of a computer simulation of a 
mathematical model of a physical sys¬ 
tem, for particular parameter inputs. 
(See, for example, the Simscript II ad¬ 
vertisement on the back cover of the 
April 1989 issue of IEEE Micro.) Icons, 
histograms, and the like appear on a 
screen instead of a table of numbers;
this type of display is intended to facilitate
comprehension of the simulation.
Animated displays of this type are simi¬ 
lar to video game displays. It is reason¬ 
able to anticipate future litigation over 
such screen displays. 

The time and effort spent in preparing 
screen displays is directed to more than 
just writing the actual code that gener¬ 
ates a screen display. Much of it goes 
into deciding what to include in the 
screen display, how to relate each screen 
to other screens used with the same pro¬ 
gram, and how to place the information 
on the screen in a way that will make it 
easy for the user to utilize the program.
All of these may be identified with making
the computer program user-friendly
and with human factors analysis.


Call for Papers

IEEE Micro seeks manuscripts for
general-interest issues in 1990.

Topics of particular interest include

□ neural networks
□ artificial intelligence
□ special-purpose computers
□ optical computers and interfaces
□ workstations
□ use of microprocessors in parallel computers
□ VHDL design
□ silicon compilation
□ biological computing
□ and tutorials on all micro-related topics.

Submit manuscripts to:

Joe Hootman, Editor-in-Chief,
EE Dept., University of North
Dakota, PO Box 7165, Grand
Forks, ND 58202, phone
(701) 777-4331.


Considerably more effort is usually 
devoted to these noncode aspects of 
screen display design than to the actual 
coding for generation of the screen dis¬ 
play. Moreover, the coding effort for 
display code is often (but is not neces¬ 
sarily) quite routine compared to the 
effort that precedes the coding. Irrespec¬ 
tive of whether the coding is routine 
compared to the screen designs, the fact 
remains that the nature of the authorship 
is different, and even the persons who 
design the screens are often different 
from those who write the display or 
working code. (For example, it is well 
known that Mitch Kapor, often credited 
with writing Lotus 1-2-3, did not write a 
single line of the code. Kapor designed 
many of the screens. Jonathan Sachs 
wrote almost all of the code and de¬ 
signed some of the screens.) 

Designing effective screens for a 
computer program is an important crea¬ 
tive effort, whose results significantly 
affect the popularity of the computer 
program. A computer program may be 
functionally very advanced, and it may 
have features that competitive programs 
lack. But it is unlikely to be commer¬ 
cially successful if it is hard for the pub¬ 
lic to learn to use the program. How the 
set of screens associated with a com¬ 
puter program is designed is a major 
determining factor of how easy it is to 
use the program. 

Like everything else, the creation of 
screen displays and other aspects of 
computer program user interfaces, and 
thus progress in this aspect of computer 
graphics, has to pay its own way. Some¬ 
body has to pay the salary of screen 
designers. Or if the designers of screens 
are individual entrepreneurs, they have 
to be willing, in effect, to pay their own 
salaries in terms of anticipated entrepre¬ 
neurial profits. Comparative data on the 
costs of coding and designing screens are
not available, but it is clear that the cost 
of designing screens and other aspects 
of the user interface for a computer pro¬ 
gram is substantial relative to the total 
cost of development. 

In the next issue I will continue with 
a discussion of the possible difficulties 
and problems that protection of user 
interfaces and screen displays might 
cause. As mentioned earlier, subsequent 
issues will present conclusions based on 
these discussions. 


References 

1. Computer, Vol. 21, No. 9, Sept. 1988, 
p. 74. 

2. Computer & Software News, Oct. 12, 
1987, p. 1. 

3. Wall Street J., Aug. 30, 1988, p. 11,
col. 2. 

4. Lotus Development Corp. v. Paperback 
Software, Inc., D. Mass., Civ. No. 87- 
0076K (suit over screen displays and 
user interface of 1-2-3 spreadsheet com¬ 
puter program). 

5. Lotus Development Corp. v. Mosaic 
Software, Inc., D. Mass., Civ. No. 87- 
0074K (same). 

6. M. Dailey, “The ‘Look and Feel’ of 
Copyrightable Expression,” 9 Eur. Intel.
Prop. Rev. 234, 235 (1987) (counsel for 
leading Crosstalk XVI communications 
computer program attributes commercial 
success of program “largely” to its 
screen displays). 

7. Whelan Associates, Inc. v. Jaslow Den¬ 
tal Laboratory, Inc., 797 F.2d 1222 (3d 
Cir. 1986), cert. denied, 107 S. Ct. 877
(1987). 

8. Digital Communications Assoc., Inc. v. 
Softklone Distrib. Corp., 2 U.S.P.Q.2d 
1385 (N.D. Ga. 1987). 

9. Broderbund Software, Inc. v. Unison 
World, Inc., 648 F. Supp. 1127 (N.D. 
Cal. 1986). 

10. R.H. Stern, “Micro Law: The look, feel, 
taste, and smell of software,” IEEE 
Micro, Apr. 1986, pp. 64-65. 


Reader Interest Survey 

Indicate your interest in this department 
by circling the appropriate number on 
the Reader Service Card. 

Low 171 Medium 172 High 173 


Department 


New Products 

Marlin H. Mickle 
University of Pittsburgh 

Send announcements of new microcomputer and microprocessor products, 
and products for review, to Managing Editor, IEEE Micro, 10662 Los 
Vaqueros Circle, Los Alamitos, CA 90720-2578. 



A holistic response to viruses 


According to the company, the 
Immune System desktop system 
protects user files from viral attack as 
well as any other external or internal 
threats. The system does not allow 
unauthorized .EXE and .COM files 
to enter or run on the computer. 

CRC integrity checks establish the 
initial conditions of files or programs 
and run file-integrity procedures. 
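
The general idea behind a CRC integrity check (record a checksum for each file while it is known to be good, then recompute and compare later) can be sketched as follows. This is a generic illustration only: the product's own procedures are proprietary and are not described in the announcement, the function names are hypothetical, and CRC-32 from Python's standard `zlib` module is an assumed stand-in for whatever CRC the system uses.

```python
import zlib
from pathlib import Path
from typing import Dict, Iterable, List


def crc32_of(path: Path) -> int:
    """Compute the CRC-32 of a file, reading it in chunks."""
    crc = 0
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            crc = zlib.crc32(chunk, crc)
    return crc


def record_baseline(paths: Iterable[Path]) -> Dict[str, int]:
    """Establish the initial condition of each file."""
    return {str(p): crc32_of(p) for p in paths}


def changed_files(baseline: Dict[str, int]) -> List[str]:
    """Return the files that are missing or whose CRC no longer matches."""
    changed = []
    for name, expected in baseline.items():
        p = Path(name)
        if not p.exists() or crc32_of(p) != expected:
            changed.append(name)
    return changed
```

A typical use would be to call `record_baseline` on the installed .EXE and .COM files and to run `changed_files` before each program is allowed to execute. A plain CRC catches accidental or naive modification; it is not a cryptographic guarantee, since an attacker who knows the algorithm can construct a modified file with the same CRC.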

Hardware and software function 
together within a complete system to 
maximize security and prevent 
problems with coordinating parts of 
separately secured elements. 
Communications between hardware, 
software, and a secured kernel fend 
off worms, Trojan horses, bombs, or 
other intrusions by means of a 
number of layered and interactive 
proprietary strategies. 

Encryption occurs in hardware— 
rather than software—according to 
US-government specifications. 
Hardware includes a 12-MHz 80286 
processor with an 80287 socket, 1 
Mbyte of DRAM, a 40-Mbyte hard 
disk, one parallel port, and two 
RS-232 serial ports. A secured clock 
prevents the casual user from 
modifying the system time-date 
stamp. Because some parts of the
security system do not need hard-disk
interaction, a section of RAM holds
pieces of software in an EPROM-like
approach to execute security
measures. A set of proprietary chips
performs anti-hacker write-protect
services.

The Immune System operates in a
synergistic fashion that links hardware,
software, and a small, highly protected
kernel to secure sensitive data.

Programmed security parameters 
are stored in software along with 
sensitive data, like audit trails. 

Utilities execute from within 
software. Conventional software 
packages work on the system except 
for programs that violate security 
vectors during their regular 
operations. The company can correct 
these exceptions. 

A secured piece of system RAM 
interprets all programmed rules and 
mediates real-time transactions. 

User-identification devices include 
such biometric interfaces as retinal 
and fingerprint investigations. 
According to the company, if a data 
thief succeeds in disturbing hardware 
security, software recognizes the 
tampering and shuts down the entire 
system while securing user data. 

Tele-comsec software supports 
telecommunications security; the 
system is Hayes-modem compatible. 
American Computer Security 
Industries; $2,995 (Model C2/286-40 
with an HGC-compatible mono¬ 
graphics card, no monitor); OEM 
pricing available. 

Reader Service Number 12 


Utility merges with editor 

The Vedit/SMK package comprises 
the Seidl Make Utility software- 
generation system and the Vedit Plus 
3.0 multifile programmable text 
editor. Users automatically generate 
object files, libraries, and executable 
files without having to leave the 
editor. Compuview Products; $334. 

Reader Service Number 10 


Card diagnoses dead PCs 

The Power-on Self Test diagnostics 
card, or Postcard, can perform diag¬ 
nostics without the support of an 
operating system. The card lets 
developers debug systems at various 
stages of development or trouble¬ 
shoot in the field. Award Software. 

Reader Service Number 11 


Make your own CASE 

A PC-based workbench enables 
Sylva Foundry MS-DOS users to 
structure their own CASE tools, 
methods, techniques, and environ¬ 
ments. The workbench helps users 
embed invisible text and integrate 
their own trigger programs. Cadware; 
from $8,500. 

Reader Service Number 13 




One family employs another 

The Tek XD88 family includes two 
graphics workstations, an applica¬ 
tions processor, and a file server, all 
based on the 88000 family. Maximum 
speeds reach the equivalent of 17 
VAX MIPS, 34,000 Dhrystones/s, 16 
million single-precision Whetstones, 
and 12 Mflops. The 88100 processor 
provides on-chip integer and floating¬ 
point multiplication. Four 88200 
cache-memory management units 
contain 64-Kbyte storage. Users can 
add four more units. 

The 3D XD88/30 workstation 
provides wireframe, shaded solid, and 
true-color bit-plane configurations 
with up to 1,310,720 colors. The 2D 
XD88/20 displays 256 colors. The 
XD88/01 applications processor can 
host Tektronix terminals or network 
stations. The XD88/05 file server 
includes the applications processor,
1.8 Gbytes of disk storage, and 2
Gbytes of streamer tape. Tektronix; 
from $34,950 (XD88/30); from 
$29,950 (XD88/20); $24,950 
(XD88/01); $75,000 (XD88/05). 

XD88/30 Reader Service Number 14 
XD88/20 Reader Service Number 15 
XD88/01 Reader Service Number 16 
XD88/05 Reader Service Number 17 


Tools develop 88000 software 

Oasys 88K Tools brings RISC 
applications development to the DEC 
VAX, the Sun-3 series, the Apple 
Macintosh II, and the Motorola VME 
Delta platform. The cross-compiler, 
assembler/linker, debugger, and 
simulator also run on 88000-based 
systems and comply with the Binary 
Compatibility Standard. Oasys, Inc.; 
$15,500 (VAX); $9,400 (Sun, Delta); 
$4,000 (Mac II). 

VAX Reader Service Number 18 
Sun Reader Service Number 19 
Delta Reader Service Number 20 
Mac II Reader Service Number 21 


More on the business of RISC 


Hypermodule supports VMEbus 



The MVME188 RISC multiprocessor quadruple subsystem contains eight 
MC88200 cache-management units and four MC88100 microprocessors. 


The MVME188 series of 
multiprocessor RISC boards provides
up to 60 MIPS of processing power 
with a 128-Kbyte cache and up to 64 
Mbytes of shared main memory in a 
VME module. Applications include 
multiuser computation servers, 
network/communications controllers, 
and large file and database servers. A 
VMEbus master/slave interface and a 
six-unit form factor provide 
controller compatibility. Software 
support includes Unix Version 3 and 
real-time operating systems, develop¬ 
ment tools, communications, and 
applications. 


MVME188 20-MHz single proces¬ 
sors run at 3.75 Mflops (double¬ 
precision Linpack), 35,700 Dhry¬ 
stones/s, and the equivalent of 15 
VAX MIPS. MVME188s are compa¬ 
tible with Binary Compatibility 
Standard software. The company 
plans dual- and quad-processor 
versions for this month. Motorola; 
$22,950 (single) (100s); $27,200 (dual) 
(100s); $33,500 (quad) (100s). 

Single Reader Service Number 22 
Dual Reader Service Number 23 
Quad Reader Service Number 24 


Weitek delivers 3D 

The XL-8832 floating-point 
processor offers 20 Mflops of 
performance for graphics 3520 and 
3540 VAXstations. The RISC proces¬ 
sor provides single-precision format 
support in a chip set that consists of 
the XL-8136 program-sequencing 
unit, the XL-8137 integer-processing
unit, and the XL-3832 floating-point
unit. Each device comes in a 144-pin 
grid array package. Software develop¬ 
ment tools include an optimizing C 
compiler, assembler, linker, and 
debugger. Weitek; $750 (XL-8832) 
(1,000s) (OEM). 

Reader Service Number 25 



Systems/servers use two 
processors 

Aviion computer system/servers 
and workstations comprise a family 
of RISC-based distributed applica¬ 
tions architectures. The system/server 
supports symmetric multiprocessing, 
incorporates the VME data path, and 
uses either one or two processors. The 
system becomes a 250-user server in 
networking applications that can 
connect 88000- and non-88000-based 
workstations. Dual-processor server/ 
systems have a performance rating of 
40 MIPS. 

Desktop workstations come with 4 
million bytes of main memory that is 
expandable to 28 million bytes. Three 
data-storage devices can be attached 
to the system including a 322-Mbyte 
mass-storage disk and 150-Mbyte 
magnetic tape cartridge units. 
Workstation performance reaches 20 
MIPS. 

The Aviion series uses the DG/UX 
4.1 revision of the company’s Unix 
operating system, which is compatible 
with Unix Version 3, BSD 4.2, and 
Posix. Data General; from $52,000 
(system/server); $7,450 (workstation). 

Server Reader Service Number 26 

Station Reader Service Number 27 


Unix coprocessor hits 17 MIPS

The Series 400 Personal Mainframe
coprocessor makes use of the Motorola
88000 reduced instruction set
architecture. The 32-bit machine for
computationally intensive applications
provides concurrent use of MS-DOS
and Unix System V operating
systems at maximum speeds of 17
MIPS. This Unix coprocessor system
employs the IBM PC AT/XT or
PS/2 Model 30/35 as an I/O processor
or subsystem for workstations
and multiuser configurations.

Total physical memory reaches 20
Mbytes in a 4-gigabyte virtual
addressing space. Features include the
X Window System Version 11.3 and
binary compatibility with other 88000
Unix systems.

A related product, the Personal
Mainframe/88SDS software development
system, offers C and Fortran
compilers. Opus Systems; from
$5,000 (coprocessor) (OEM); from
$11,200 (SDS).

Coprocessor Reader Service Number 31
SDS Reader Service Number 32


MIPS displays own
computations

The RC2030 Riscomputer provides
12 MIPS of speed and 1.8 double-precision
Mflops of computational
power and plays multiple roles in
distributed computing. The desktop
workstation can perform either as a
host for local-station work groups
and X-display stations or as a
networked file server. A 16.67-MHz
R2000 CPU, an R2010 FPU, and
separate 32-Kbyte instruction and
data caches support this performance.
For large applications, the Riscomputer
desktop workstation can
support up to 16 Mbytes of main
memory and 344 Mbytes of disk
storage.

While the RC2030 computes, the
RS1210 X-Display station processes
the graphics portion of an application.
The 16-inch, 105-dpi monochrome
display features an X Windows
server, a resolution of 1,024 x 1,024,
and a 70-Hz, noninterlaced refresh
rate. MIPS Computer System; from
$17,000 (RC2030); from $3,200
(RS1210).

RC2030 Reader Service Number 28
RS1210 Reader Service Number 29


The ins and outs 
of data 


Pick better performance 

According to the company, the 
Pik-fast digitizer table overlay reduces 
input time to Cadkey microcomputer- 
based 3D CAD systems by 60 per¬ 
cent. The overlay compresses the 
number of screen-menu selections 
required by a mouse into a general, 
picking-device operation. This 
function also drives immediate mode 
commands, on-line calculation, and 
display-status controls. Other 
selections are organized by linear- 
motion, color-key and icon-represen¬ 
tation strategies. A centrally located 
view visualizer and user-definable 
selection area complete the package. 
Jensen Properties (supports Cadkey 
3.12; free upgrade with Cadkey 3.5 
release). 

Reader Service Number 33 


Transducer boasts big buffers 

The R1000 waveform digitizer 
features four channels that each have 
an 8-bit, 500-KHz A/D converter. 
Sample data buffers encompass 32 
Kbytes per channel. The turnkey 
peripheral for PCs comes with digital- 
scope software and drivers for the C, 
Turbo Pascal, and Basic program¬ 
ming languages. Rapid Systems; 
$1,995. 

Reader Service Number 34 


Another mouse alternative 

The Trackball Plus cursor pointing 
device emulates Microsoft and Mouse 
Systems mice and the Bit Pad One 
digitizing tablet. The six-button 
trackball also supports Lotus 1-2-3, 
Word Perfect, and DOS commands 
through pop-up menus. An RS-232 
serial port connects to CAD/CAM 
environments. Fulcrum Computer 
Products; $99. 

Reader Service Number 35 


RISCs support host-coupled graphics 


The configurable Adage 200 Color 
Graphics Processor houses a 25-MHz
Am29000 CPU with a 192-register file.
Virtual Windows technology 
accompanies a 12 x 24 look-up table 
for display of 4,096 colors with a 
1,280 x 1,024, 60-Hz resolution. 
Users can combine the VLSI two-board
set with an Egos graphics
operating system. Custom options 
include a 2,560 x 2,048 frame 
buffer, an expansion VME chassis, 
and a 25-MHz Am29027 math copro¬ 
cessor. Adage, Inc. 

Reader Service Number 30 




Company offers “mouseboard” 



Keytrak integrates the two most common manual-input devices for PCs: the 
keyboard and the mouse. 


For users who want to use a mouse
without abandoning their keyboard,
the Keytrak trackball integrates the
two. The package is compatible with
both Microsoft and Mouse Systems
serial mouse drivers. Users can select
either the IBM PC AT or XT through
a switch under the keyboard. A
Y-shaped cable connects the
keyboard and serial ports. Three
program-dependent mouse buttons
reside above the trackball and a two-button
mouse is duplicated on the
keyboard for flexibility. Octave
Systems; $189.

Reader Service Number 37


All-in-one graphics touch terminal

The Touchcom GTS combines
touch entry, graphics, data communications,
and video technologies into
one PCB. The board can be used as a
terminal or stand-alone system
instead of a keyboard. Features
include a 768 x 480 x 8-pixel
modular graphics memory and a
graphics processor with a high-level
command language, two RGB video
inputs, and resistive touch-screen
inputs.

An on-board, 16-MHz 800 86
microprocessor and a modular
program memory that accommodates
a mix of PROM and RAM of up to 1
Mbyte can substitute for disk memory
and an external microcomputer.
Software includes the Vrtx real-time
operating system, a graphics
operating system, an X.25 data-communications
package, and a PC-compatible
ROM BIOS. Digital
Techniques; under $2,000 (OEMs).

Reader Service Number 38


Touch screen supports graphics 



The Lucas Duralith touch screen 
provides 1,024 x 1,024-pixel input 
from front-control panels to 
applications ranging from medical 
instrumentation to automated 
process-control systems. 

The Lucas Duralith touch screen 
combines with custom graphic pre¬ 
sentations for insertion into pre¬ 
assembled front panels. Users can 
mount the panel either to rigid subpanels
or to a PCB. When users press
two resistive panels together, voltage 
drops at the intersection and input 
changes from analog to digital. 

Design options include interchange¬ 
able legends, tactile response 
switches, and formed overlays. 
Contact company for custom pricing. 
Lucas Duralith. 

Reader Service Number 40 


Translator standardizes PCB 
layout 

The IGES Board Station Trans¬ 
lator translates graphics and related 
data files from the Board Station 
PCB layout system to the Inter¬ 
national Graphics Exchange Stan¬ 
dard. It also translates from standard 
IGES 4.0 files to Board Station 
design files. Mentor Graphics; 
$15,000 (site license). 

Reader Service Number 36 


Data skis cross-country 

The international edition of the 
Xchange PC software conversion 
program reads MS-DOS country code 
settings and allows users to display 
and access accented characters and 
international punctuation marks. The 
enhancement “speaks” and displays 
the local language in use. Emulation 
Technologies; $745 (Int’l edition); 
$150 (upgrade from Xchange). 

Reader Service Number 39 


Reader Interest Survey 

Indicate your interest in this department 
by circling the appropriate number on 
the Reader Service Card.

Low 180 Medium 181 High 182 


Advertiser/Product Index


Advertiser                            Page #

CACI Products Company                 Cover IV
Fujitsu Ltd.                          Cover II
Visible Systems Corp.                 1


Product                               RS #      Page #

BOARDS
Multiprocessor RISC board             22-24     90

CHIPS
Coprocessor                           1, 31     C.II, 91
Floating-point processor              25        90
Microprocessor                        1         C.II

DATA ACQUISITION
Digitizer table                       33        91
Waveform digitizer                    34        91

DATA COMMUNICATIONS
Applications processor                16        90
File server                           17        89
Graphics translator                   36        92

I/O RELATED EQUIPMENT
Cursor                                35        91
Touch screen                          40        92
Touch terminal                        38        92
Trackball                             37        92
2D color display                      15        90

SOFTWARE
CASE tool                             2         1
Simulation package                    -         C.IV
Software conversion program           39        92

SYSTEMS
Color graphics processor              30        91
Computer/server                       26-27     91
Development environment               1         C.II
PC-based workbench                    13        89
RISC applications development         18-21     90
RISC computer                         28-29     91
Security system                       12        89
Software development system           32        91
Software-generation system            10        89
Graphics workstation                  14        90

TEST & MEASUREMENT EQUIP.
Diagnostic card                       11        89


Coming in August...

• High-performance microprocessors featuring the 64-bit Intel i860
• Special Feature: Comparing RISC architectures


FOR DISPLAY ADVERTISING INFORMATION, CONTACT:

Northern California and Pacific Northwest: Roy McDonald Assoc. Inc.,
5915 Hollis St., Emeryville, CA 94608; (415) 653-2122.
Jim Olsen, P.O. Box 696, Hillsboro, OR 97123; (503) 640-2011.

Southern California and Mountain States: Richard C. Faust Co., 24050
Madison St., Suite 100, Torrance, CA 90505; (213) 373-9604.

Southwest: The House Co., 5252 Westchester, Suite 280, Houston, TX
77005; (713) 668-1007.

East Coast: Atlantic Representative Group, 349 Maple Place, Keyport, NJ
07735; (201) 739-1444.

New England: Arpin Associates, 40 Sterling St., Somerville, MA 02144;
(617) 625-1777.

Europe: Heinz J. Gorgens, Parkstrasse 8a, D-4054 Nettetal 1 - Hinsbeck
(F.R.G.); phone: (0 21 53) 8 99 88; telex 841 (17)2153310=HJG tlx d.

For production information, conference, and classified advertising, contact
Heidi Rex or Marian Tibayan.

IEEE MICRO, 10662 Los Vaqueros Cir., Los Alamitos, CA 90720; phone
(714) 821-8380; fax (714) 821-4010.

continued from p. 8 


Micro World 



Neural Network Research Activities 


Institute/Contact 

Research specialty Institute/Contact 

Research specialty 


University of Stuttgart Pattern recognition 
H. Haken 


Nuclear Energy 
Research Institute 
(KFA) Juelich 
H. Mueller-Krumbhaar 

GMD, Society for 
Mathematics and 
Data Processing 
St. Augustin 
H. Muehlenbein 

Israel 

Hebrew University 
Jerusalem 

D. J. Amit 

Weizmann Institute 
Rehovot 

E. Domany 

Italy 

University of Rome 
G. Parisi 


Fuzzy logic 
Optimization 


Neural network 
implementation 
Genetic algorithms 


Hopfield networks 
Pattern association 
Statistic physical methods 

Associative memories 


Associative memories 
Hierarchical networks 
Learning 


IBM Rome 
S. Patarnello 


University of Zurich 
R. Pfeifer 

Asea Brown Boveri 
Research Center 
Baden 

J. Bernasconi 


Boolean networks 
Analysis of learning 
and generalization 


Switzerland 

Swiss Federal Institute
of Technology, Zurich
O. Kuebler

Swiss Federal Institute
of Technology
Lausanne
J.D. Nicoud

CSEM Neuchatel
E. Vittoz 


Image processing 


Neural networks for 
robotics 

VLSI implementation 
Teaching aids 

Robot control 
Topology-conserving maps 
Signal separation 

Artificial intelligence 

Algorithms 
Process automation 


Reader Interest Survey 

Indicate your interest in this department by circling the appropriate number on the Reader Service Card. 

Low 174 Medium 175 High 176 


Micro View 

continued from p. 96 


How many pixels do you provide? 

Almost a million. Specifically, 1,120
× 832 × 2 bits deep and 94 dots per
inch, giving us four shades of gray— 
black, dark gray, light gray, and white. 
The four shades render images better 
than black and white. We would have 
liked to offer more shades, but we had 
to keep the price down. 
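
The figures quoted above can be checked with a little arithmetic. The following is a minimal sketch in Python, not anything taken from Next's documentation; the function name `display_summary` is hypothetical, and the frame-buffer size assumes the 2-bit pixels are packed with no padding, which the interview does not state.

```python
# Display figures quoted in the interview.
WIDTH_PX = 1120
HEIGHT_PX = 832
BITS_PER_PIXEL = 2
DOTS_PER_INCH = 94


def display_summary(width, height, bits_per_pixel, dpi):
    """Derive pixel count, gray levels, and image size from the quoted specs."""
    pixels = width * height
    return {
        "pixels": pixels,
        # n bits per pixel can encode 2**n distinct gray levels.
        "shades": 2 ** bits_per_pixel,
        # Assumption: pixels are tightly packed, 8 bits to the byte.
        "framebuffer_bytes": pixels * bits_per_pixel // 8,
        # Size of the addressable image area in inches at the stated density.
        "width_in": width / dpi,
        "height_in": height / dpi,
    }


summary = display_summary(WIDTH_PX, HEIGHT_PX, BITS_PER_PIXEL, DOTS_PER_INCH)
for name, value in summary.items():
    print(name, value)
```

Under those assumptions the screen holds 931,840 pixels ("almost a million"), four gray levels, and a frame buffer of 232,960 bytes.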

People seem to like color, too, but you 
decided against it. 

Yes, they do—price again forced our 
decision. The Next cube presently has 
three vacant slots, so one of the options 
is going to be—in about a year—a 
graphics card that drives a color monitor. 


Let’s talk about software briefly. Why 
did you select Unix? 

It took a little time to sort out the 
merits of Unix System V v. Berkeley 
4.3. We think the two are slowly coming 
together, but it became clear that 
Berkeley 4.3 was the right one to be 
compatible with. It is more widely used 
in the university environment. We finally
selected Carnegie Mellon
University’s Mach multitasking operating
system, which is compatible with
Unix 4.3. 

How did you make Unix easy to use? 

We developed a user interface and
development environment that we call
Next Step. For the end user this environment
masks the complexities of the
operating system behind a window-based,
graphical workspace. The user
can locate and manage files, display the 
contents of directories, and launch 
applications and utilities, all without 
any knowledge of Unix. 

What did you do to achieve more
performance?

The first step was to select the fastest
commercial processors for the portion of
the work they are good at. Our group
has had long experience with the Motorola
68000 family. We compared it with
the other choices that were available
back in 1986 and found it was still a
good choice. We decided to go with
Motorola’s then top-of-the-line microprocessor
and memory-management
unit, the 68030, a 25-MHz processor
capable of 5 MIPS. To its capabilities,
we added the MC68882 floating-point
coprocessor for mathematical computations
and the Motorola DSP56001
digital signal processor to support
computation-intensive processes such
as sound synthesis.

What is the purpose of the digital
signal processor?

The DSP makes it possible to utilize 
the sound and visual pathways to the 
brain so learning and communicating are 
easier. At 10 MIPS the DSP can manipulate
waveforms for sound, music,
images, real-time analysis of experimental
data, and many other phenomena. 

What was your second step toward 
high performance? 

Moving data around ordinarily eats up 
a lot of processor time. So we separated 
that function from the CPU and put it in 
specially designed DMA (direct memory 
access) and data controllers. Our integrated
channel processor, for example,
manages the flow of data within the system,
particularly between main memory,
the CPU, and peripheral devices. The 
optical storage processor handles data 
flow for the optical disk. Those were 
two big efforts. Each is contained in one 
proprietary VLSI chip. 

Your decision about main memory is 
striking. Why did you go to eight 
megabytes? 

At first we thought that four megabytes
would be sufficient for someone to
run the operating system, the window
system, and a small application. But
then we realized that pretty soon a user
would want a larger application or a second
application and then would need
more memory. So we decided eight
megabytes was the right amount. In addition,
a user can add four or eight more
megabytes as options.

Of the eight megabytes how much is 
left for the user after you load the 
operating system and the other 
necessities? 

After the first application, say the
word processor, I think there are about
three megabytes left, enough to get a
couple more applications in, such as
printing. Incidentally, all of the electronics
fits on one card in a one-foot
cube, powered by a 200-watt supply, so
the machine is small, cool, and quiet.

Why did you select a magneto-optical 
disk and not a large Winchester? 

A couple of reasons: removability and
reliability. Users can remove the small,
3.5-inch, 256-megabyte optical disk and
walk away with all of their files. They
can take the files home or on a trip.
With a Winchester magnetic drive, however,
to do this users have to spend 20
minutes running their files out on a portable
tape. And they have to have the
tape drives to do it. Second, the optical
disk offers higher reliability than the
Winchesters because the head is much
farther away from the medium. You
don’t have the head-crash problem
found on the Winchesters.

For these reasons we think the optical
disk is going to be the storage technology
of the 1990s. Still, we offer 330-Mbyte
and 660-Mbyte Winchester
options for those who want a lot of
hard-disk storage.


How long does it take to bring something
up from the optical disk?

To launch a program, like the Webster 
dictionary or the Shakespeare plays 
(standard issue with the original optical 
disk), takes 10 or 15 seconds, depending 
on the program size. Once a program is 
loaded, finding a particular detail takes 
only a second or two. 

New application programs today are 
usually distributed on inexpensive 
floppy disks. With only the optical 
drive available, how do you plan to 
get new programs out to the user? 

In the short term we expect that software
developers are going to use the optical
disk as the distribution medium.
People who come from the PC marketplace
think its $50 price is very expensive.
Still, for programs that arrive on
seven or eight floppy disks, $50 is not
too costly. People who come from the
workstation marketplace, and use cartridge
tapes costing $30 apiece, don’t
seem to be bothered by the $50 price. In
the long term we expect to see more of
the Next machines on networks with users
downloading application programs.

Why did you limit your hard-copy 
output to your own laser printer? 

Once you use a laser printer, you get 
used to its speed, quality, and relative
quietness compared to the dot matrix
printer. What people really wanted, we 
found, is a fairly priced laser printer. So 
we set out to build one at the lowest 
possible cost, and, at $2,000, we think 
we have one. Moreover, at 400 dots per 
inch, it is higher quality than the usual 
300-dpi printer. 
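
To put the 400-versus-300 dots-per-inch comparison in concrete terms, here is a small Python sketch. It is only an illustration of the arithmetic: the function name `dots_per_square_inch` is hypothetical, and it assumes equal resolution in both directions, which the interview implies but does not spell out.

```python
def dots_per_square_inch(dpi):
    """Dot density, assuming the same resolution horizontally and vertically."""
    return dpi * dpi


next_printer = dots_per_square_inch(400)   # the Next laser printer
usual_printer = dots_per_square_inch(300)  # the "usual" laser printer
ratio = next_printer / usual_printer

print(next_printer, usual_printer, round(ratio, 2))
```

A one-third increase in linear resolution therefore gives roughly 1.78 times as many dots in the same area.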

How did you get screen and print
output to match?

Many systems have one language for 
showing elements on the screen and a 
separate language for outputting them 
on the printer. Then a programmer has 
to write one piece of code for the screen 
and another piece for the printer. The 
two outputs don’t always precisely 
match. 

We worked with Adobe Systems to
develop Postscript to serve both functions.
Our original concern was whether
Postscript would be fast enough for the
screen, but we got it to the point where
performance is quite good. So we
wound up with one language, Postscript,
for both purposes. It makes application
development a lot easier.
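
The benefit of a single imaging language can be sketched in a few lines. The Python below is an illustration only, not Postscript and not any Next or Adobe interface: the names `to_device` and `render` and the sample square are hypothetical. It relies on one known fact, that Postscript's default unit is 1/72 inch, and shows the same device-independent description mapped onto the 94-dpi screen and the 400-dpi printer.

```python
POINTS_PER_INCH = 72  # Postscript's default user-space unit is 1/72 inch


def to_device(x_pt, y_pt, dpi):
    """Map a point given in 1/72-inch units to the nearest device dot."""
    return (round(x_pt * dpi / POINTS_PER_INCH),
            round(y_pt * dpi / POINTS_PER_INCH))


def render(path, dpi):
    """Convert a whole device-independent path for one output device."""
    return [to_device(x, y, dpi) for x, y in path]


# Hypothetical page description: a one-inch square whose lower-left
# corner sits one inch from the origin, written once in points.
square = [(72, 72), (144, 72), (144, 144), (72, 144)]

screen = render(square, 94)    # display density quoted in the interview
printer = render(square, 400)  # laser-printer density quoted in the interview

print(screen)
print(printer)
```

The application describes the square once; only the final scaling step differs per device, so the shape occupies the same one inch on both outputs.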

Earlier you mentioned some trade-offs
made to achieve the goals the
users wanted. Did you have to trade
off anything else?

The machine is more expensive 
($6,500 for the educational marketplace; 
$10,000 commercially) than we would 
have liked. Over time we hope to be 
able to trim both those numbers. 

Anything else you’d like to add for 
our readers? 

Beyond the design of the product, the 
three biggest developments we’re 
pleased about are that more than 50 
third parties are writing programs for 
the machine, IBM has licensed the Next 
Step environment, and Businessland is 
distributing the product in a second 
marketplace. 


Reader Interest Survey 

Indicate your interest in this department 
by circling the appropriate number on 
the Reader Service Card. 

Low 186 Medium 187 High 188 










Department 


Micro View 


Design choices 
power the 
Next wave 

Ware Myers 
Contributing Editor 



The Next workspace reminds us of the Macintosh—yet it's different. Its high 
resolution (94 dots per inch—the original Macintosh was 70) avoids jaggies as 
seen in the upper left diagrams and provides texture on the atoms. Its rapid
refresh rate (68 Hz) makes rock-solid pictures. The small diagrams at the right
margin are icons. 


This month Next Inc. begins volume
distribution of the Next Computer
System from its automated plant in
Fremont, California. The Businessland
chain of computer stores recently committed
to take $100,000,000 of the
machines, a cross between a personal
computer and a workstation, in the first
year. In addition, Steven P. Jobs’ latest
enterprise will continue to sell directly
to the market for which it was originally
designed: academe.

As Next views the personal-computer 
scene, there have been three great waves 
so far—the Apple, the IBM PC and its 
clones, and the Macintosh. The fourth 
wave, the company believes, will be its 
Next system. How do you go about creating
a new wave? Steve Jobs and his
long-time associates know something 
about that. They have done it twice 
before. 

It seems you ask potential users what 
they would like to have in a system. 
Then you scour the world of technology 
for what can be pulled together in time 
for the new machine. You do all this 
before anyone else can. You squeeze as 
many dollars out of the cost as you can. 
Then make a big splash so the world of 
users knows what you have. Sound 
easy? Sure, but it is extremely difficult 
to make all the pieces fit together. 



IEEE Micro asked Next’s Richard A. 
Page how the marketplace requirements 
fed back into the choices the design 
team made. Page is vice president of 
digital hardware engineering and one of 
six Next founders. Earlier he was instrumental
in the initial design of Lisa at
Apple and was responsible for the decision
to use the Motorola M68000 family
in the Lisa and Macintosh computers.
We caught him in a brief interlude between
trips to Europe and Japan.

What did the marketplace want? 

We spoke with potential users at more 
than a dozen universities—our original 
target market—in the fall of 1985 and 
early 1986. They said they needed
something more than a personal computer.
They liked the power of the workstation,
but they characterized it as big,
hot, and noisy. They enjoyed the attributes
of the personal computer: small,
cool, quiet, and reliable.

What else did the users ask for? 

They wanted a big enough screen to
display a full page of text. They were
interested in a Unix-based system, but
they wanted it to be easy to use. Of
course, they expected a well-designed
and consistent workspace. Also, they
wanted more performance and more
memory, naturally. Learning, or communications
in general, involves more
than just words. Pictures and sound also
help get the meaning across. So we now
had more performance requirements.

0272-1732/89/0600-0096$01.00 © 1989 IEEE

In addition, they had quantities of 
information they wanted to have readily 
available. That meant a lot of mass 
storage. Some of them were getting 
accustomed to laser printing, so they 
wanted that capability, too. They wanted 
it to be affordable, and they wanted the 
output to look the same as what they 
saw on the screen. 

That is a big order for a personal 
computer, or even a workstation. 
Where did you start? 

One of the first decisions we had to 
make concerned the size of the display. 
A 15-inch size is too small to show a 
full page of text. Yet, a 19-inch size is 
large for a desktop system—users 
wanted to get away from it. We settled 
on 17 inches. The determining factor 
was to have enough pixels so that the 
screen appears to show a full page, since 
people relate best to what they see on 
paper. That is why we are running
black letters on a white (or gray) background,
too.

continued on p. 94 




































IEEE COMPUTER SOCIETY 

A member society of the Institute of Electrical and Electronics Engineers, Inc. 


Executive Committee 

President: Kenneth R. Anderson* 

Siemens Research & Technology 
755 College Road East 
Princeton, NJ 08540 
(609) 734-6550 

President-Elect: Helen M. Wood* 

Past President: Edward A. Parrish, Jr.* 

Vice Presidents 

Conferences and Tutorials: Joseph E. Urban (1st VP)*
Technical Activities: Laurel V. Kaleda (2nd VP)*
Area Activities: Ned Kornfield†
Education: Gerald L. Engel†
Membership and Information: Barry W. Johnson†
Press Activities: Duncan H. Lawrie*
Publications: Sallie V. Sheppard*
Standards: Paul L. Borrill*

Secretary: Michael Evangelist*

Treasurer: Charles B. Silio†
Division V Director: Harriett Rigas†
Division VIII Director: Roy L. Russo†
Executive Director: T. Michael Elliott†

*voting member of the Board of Governors
†nonvoting member of the Board of Governors

Board of Governors 

Term Expiring 1989: 

Bill D. Carroll, Lansing (Chip) Hatfield, 

Duncan H. Lawrie, David Pessel, 

Susan L. Rosenbaum, Sallie V. Sheppard, Bruce Shriver, 
Harold S. Stone, Akihiko Yamada, Marshall C. Yovits 

Term Expiring 1990: 

Vishwani Agrawal, Mario R. Barbacci, 

Ming T. (Mike) Liu, Yale N. Patt, Donald E. Thomas, 
Benjamin W. Wah, Ronald Waxman 

Term Expiring 1991: 

P. Bruce Berra, Paul L. Borrill, Michael Evangelist,
Ted Lewis, Raymond E. Miller, 

Earl E. Swartzlander, Jr., Thomas W. Williams 

Next Board Meeting 

November 17, 1989, 8:30 a.m. 

Bally's Hotel, Reno, NV 

Senior Staff 

Executive Director: T. Michael Elliott 
Editor and Publisher: H. True Seaborn 
Publisher, Computer Society Press: Eugene M. Falken 
Director, Conferences and Tutorials: Anne Marie Kelly 
Director, Finance and Administration: Tod S. Heisler 
Assistant to the Executive Director: Violet S. Doan 

Computer Society Offices 

Headquarters Office 

1730 Massachusetts Ave. NW 
Washington, DC 20036-1903 
Phone (202) 371-0101 
Telex: 7108250437 IEEE COMPSO 
Fax: (202) 728-9614

Publications Office 

10662 Los Vaqueros Cir. 

Los Alamitos, CA 90720-2578 
Membership and General Information: (714) 821-8380
Publication Orders: (800) 272-6657 
Fax: (714) 821-4010 

European Office 

13, Ave. de L'Aquilon 
B-1200 Brussels, Belgium 
Phone: 32 (2) 770-21-98 
Fax: 32 (2) 770-85-05 

Asian Office 

Ooshima Building 
2-19-1 Minami-Aoyama, Minato-ku 
Tokyo 107, Japan
Phone: 81 (3) 408-3118 
Fax: 81 (3) 408-3553 


Use the Reader Service Card to obtain information on: 

• Membership application—student #203, others #202 

• Periodicals subscription form for individuals #200

• Periodicals subscription form for organizations #199 

• Publications catalog #201 

• Standards working groups list #195 

• Compmail+ international electronic mail/database brochure 
#194 

• Technical committee list/application #197 

• Chapters lists, start-up procedures—student/regular #193 

• Student scholarship information #192 

• Awards description/nomination forms #198 

• Volunteer leaders/staff directory #196 

• IEEE senior member application #204 

Purpose 

The IEEE Computer Society advances the theory and practice of 
computer science and engineering, promotes the exchange of 
technical information among 100,000 members worldwide, and 
provides a wide range of services to members and nonmembers. 

Membership 

Members receive the acclaimed monthly magazine Computer, 
discounts, and opportunities to serve (all activities are led by 
volunteer members). Membership is open to all IEEE members, 
affiliate society members, and others seriously interested in the 
computer field. 

Publications and Activities 

Computer. An authoritative, easy-to-read magazine containing 
tutorial and in-depth articles on topics across the computer field, plus 
news, conferences, calendar, interviews, and new products. 

Periodicals. The society publishes six magazines and four 
research transactions. Refer to membership application or request 
information as noted above. 

Conference Proceedings, Tutorial Texts, Standard 
Documents. The Computer Society Press publishes more than 100 
titles every year. 

Standards Working Groups. Over 100 of these groups produce 
IEEE standards used throughout the industrial world. 

Technical Committees. More than 30 TCs publish newsletters, 
provide interaction with peers in specialty areas, and directly 
influence standards, conferences, and education. 

Conferences/Education. The society holds about 100 
conferences each year and sponsors many educational activities, 
including computing science accreditation. 

Chapters. Regular and student chapters worldwide provide the 
opportunity to interact with colleagues, hear technical experts, and 
serve the local professional community. 

European Office 

Payments for Computer Society membership and publication 
orders are accepted by checks in Belgian, British, German, Swiss, or 
US currency. Checks in US funds must be drawn on a US bank. 
Payment may also be made by American Express, Diners Club, 
Eurocard, Master Card, or Visa credit cards. 

Asian Office 

Payments for Computer Society membership and publication 
orders are accepted by checks in Japanese or US currency. Checks 
in US funds must be drawn on a US bank. Payment may also be 
made by electronic fund transfer to the Bank of Tokyo, Akasaka 
Branch, Toza acct. 0767956; the credit receiver is the IEEE 
Computer Society Headquarters Office. Payment may also be made 
by American Express, Diners Club, Eurocard, Master Card, or Visa 
credit cards. 

Ombudsman 

Members experiencing problems — magazine delivery, 
membership status, or unresolved complaints — may write to the 
ombudsman at the Publications Office. 












Breakthrough in presentation of simulation results 
SIMSCRIPT II.5 with SIMGRAPHICS 
Now you see an animated picture of the system 



SIMGRAPHICS™ menu builder

Draw your own icons, or use ours

Presentation graphics

Communications: COMNET II.5®

Transportation system

SIMGRAPHICS advanced user interface


Free trial and training 

See for yourself how simulation 
results are now easier to understand. 

The free trial contains everything 
you need to try SIMGRAPHICS™ 
on your computer. 

We send you SIMSCRIPT II.5, 
animated models, and complete 
documentation. You can build your 
own model or modify one of ours. 

Try the SIMSCRIPT II.5® language,
the timeliness of our support,
the accuracy of our documentation,
and the facilities for error checking:
everything you need for a successful
project.

For 26 years CACI has provided
trial use of its simulation software:
no cost, no obligation.

Act now for free training 

For a limited time we will also include
free training.

For immediate information 

Call Hal Duncan at (619) 457-9681, 
FAX (619) 457-1184. In Europe, call 
Richard Eve on (01) 528-7980, FAX 
(01) 528-7988. 


Rush information on SIMSCRIPT II.5 
with SIMGRAPHICS 

Limited offer-return the coupon today 
and we will also include one free course 
enrollment worth $950. 

□ Send information on your Special 
University Offer. 

Name 


Organization 


Address 


City State Zip 

Telephone 


Computer Operating System 

IEEE MICRO 

Return to: 

CACI Products Company 
3344 North Torrey Pines Court 
La Jolla, California 92037 

Call Hal Duncan at (619) 457-9681. 

FAX (619) 457-1184. 

In Europe:

CACI Products Division
Regent House, 89 Kingsway
London, WC2B 6RH, United Kingdom

Call Richard Eve on (01) 528-7980.

FAX (01) 528-7988.

SIMSCRIPT II.5, NETWORK II.5, SIMFACTORY II.5, and 
SIMGRAPHICS are registered trademarks and service marks of 
CACI, INC. COMNET II.5 is a trademark and service mark of 
CACI, INC. ©1989 CACI, INC. 








































